WEBVTT - OpenAI, Meta Fair Use Bids Risky in Copyright Drama

0:00:12.960 --> 0:00:16.280
<v Speaker 1>Hello, and welcome to the Votes and Verdicts podcast hosted

0:00:16.320 --> 0:00:19.720
<v Speaker 1>by the litigation and policy team at Bloomberg Intelligence, an

0:00:19.800 --> 0:00:24.440
<v Speaker 1>investment research platform of Bloomberg LP. Bloomberg Intelligence has five

0:00:24.520 --> 0:00:28.120
<v Speaker 1>hundred analysts and strategists working across the globe and focused

0:00:28.160 --> 0:00:31.440
<v Speaker 1>on all major markets. Our coverage includes over two thousand

0:00:31.480 --> 0:00:34.400
<v Speaker 1>equities and credits, and we have outlooks on more than

0:00:34.520 --> 0:00:39.000
<v Speaker 1>ninety industries and one hundred market indices, currencies and commodities. Now,

0:00:39.000 --> 0:00:43.640
<v Speaker 1>this podcast series examines the intersection of business, policy and law.

0:00:43.840 --> 0:00:47.320
<v Speaker 1>I'm Tamlin Bason. I'm an analyst with Bloomberg Intelligence covering

0:00:47.400 --> 0:00:51.960
<v Speaker 1>intellectual property litigation impacting the tech sector. Now, often that

0:00:52.040 --> 0:00:55.400
<v Speaker 1>means a focus on patent litigation, but right now we're

0:00:55.440 --> 0:00:59.120
<v Speaker 1>in a period of a rising tide of copyright litigation

0:00:59.600 --> 0:01:03.800
<v Speaker 1>that could have profound impacts on the singular technology issue

0:01:03.800 --> 0:01:08.120
<v Speaker 1>of the past few years, that being artificial intelligence, more

0:01:08.160 --> 0:01:12.560
<v Speaker 1>specifically generative AI, which is made capable by large language

0:01:12.560 --> 0:01:15.240
<v Speaker 1>models that are trained on massive data sets. Now, the

0:01:15.280 --> 0:01:18.520
<v Speaker 1>sticking point is that much of that training data was copyrighted,

0:01:18.840 --> 0:01:21.080
<v Speaker 1>and we've seen dozens of lawsuits pop up in the

0:01:21.160 --> 0:01:24.200
<v Speaker 1>US brought by rights holders to some of those data

0:01:24.200 --> 0:01:27.800
<v Speaker 1>sets and against the developers of those large language models.

0:01:28.000 --> 0:01:30.560
<v Speaker 1>Today we're going to explore some of those lawsuits, and

0:01:30.600 --> 0:01:32.440
<v Speaker 1>we're going to spend quite a bit of time talking

0:01:32.480 --> 0:01:34.600
<v Speaker 1>about something called fair use. And we're going to be

0:01:34.600 --> 0:01:36.680
<v Speaker 1>doing all of this with the very capable help of

0:01:36.760 --> 0:01:40.640
<v Speaker 1>Jeremy Goldman, partner and co-chair of the Emerging Technology

0:01:40.680 --> 0:01:45.200
<v Speaker 1>Group with the law firm Frankfurt Kurnit Klein & Selz. Jeremy,

0:01:45.200 --> 0:01:47.240
<v Speaker 1>thanks so much for joining us today. Now let's start

0:01:47.240 --> 0:01:50.840
<v Speaker 1>with fair use. This seems to be the primary defense

0:01:50.880 --> 0:01:53.360
<v Speaker 1>that's going to be put forward by developers of large

0:01:53.480 --> 0:01:56.040
<v Speaker 1>language models. Now, fair use seems like a bit of

0:01:56.080 --> 0:01:58.160
<v Speaker 1>a get-out-of-jail card, in that it doesn't

0:01:58.200 --> 0:02:00.520
<v Speaker 1>mean that the rights holders' copyrights weren't infringed, but

0:02:00.600 --> 0:02:03.640
<v Speaker 1>it does seem to mean that the infringement was justified

0:02:03.920 --> 0:02:07.160
<v Speaker 1>by the end use. And Jeremy, what does someone accused of

0:02:07.200 --> 0:02:10.680
<v Speaker 1>copyright infringement have to show in order to get that

0:02:10.720 --> 0:02:11.560
<v Speaker 1>fair use protection?

0:02:12.040 --> 0:02:14.760
<v Speaker 2>Yeah, well, thanks, Tamlin, for having me, and happy to

0:02:14.800 --> 0:02:17.919
<v Speaker 2>talk about these cases and fair use. You know first,

0:02:18.080 --> 0:02:21.359
<v Speaker 2>just, you know, fair use is actually encoded in

0:02:21.560 --> 0:02:24.120
<v Speaker 2>the United States Copyright Act in a section called you know,

0:02:24.200 --> 0:02:28.359
<v Speaker 2>seventeen USC one oh seven, and that section actually says

0:02:28.400 --> 0:02:31.920
<v Speaker 2>that even though you might exercise one of the exclusive

0:02:31.960 --> 0:02:34.160
<v Speaker 2>rights of copyright, even though you might make a copy,

0:02:34.200 --> 0:02:37.800
<v Speaker 2>even though you might make a distribution of somebody's copyright

0:02:37.800 --> 0:02:40.560
<v Speaker 2>protected work, what Section one oh seven says is that even

0:02:40.560 --> 0:02:43.519
<v Speaker 2>if you exercise one of those exclusive rights that make

0:02:43.600 --> 0:02:46.240
<v Speaker 2>up the bundle of sticks that is a copyright, that

0:02:46.360 --> 0:02:49.240
<v Speaker 2>it's not an infringement. It's actually not an infringement, and

0:02:49.280 --> 0:02:51.280
<v Speaker 2>it's fair use. And it just says if it's a

0:02:51.280 --> 0:02:52.800
<v Speaker 2>fair use, And then it says, well, what is a

0:02:52.800 --> 0:02:56.560
<v Speaker 2>fair use? And it says, well, it's a case specific

0:02:57.000 --> 0:03:01.400
<v Speaker 2>question about whether it's fair use. And Congress tells courts

0:03:01.639 --> 0:03:04.680
<v Speaker 2>that we're going to look at four factors to decide

0:03:04.720 --> 0:03:07.600
<v Speaker 2>in each instance, and they're non exclusive, but there's four

0:03:07.639 --> 0:03:10.119
<v Speaker 2>factors that they say courts should look at whether it's

0:03:10.160 --> 0:03:12.400
<v Speaker 2>fair use, and they're all encoded in the statute. We

0:03:12.440 --> 0:03:14.400
<v Speaker 2>can go through the four of them, but that's basically

0:03:14.520 --> 0:03:17.120
<v Speaker 2>in all the cases involving fair use, courts will look

0:03:17.360 --> 0:03:20.720
<v Speaker 2>and examine each of the four factors, weigh them, and

0:03:20.840 --> 0:03:24.560
<v Speaker 2>then decide whether on balance those factors tilt in favor

0:03:24.600 --> 0:03:26.440
<v Speaker 2>of it being a fair use or not being a

0:03:26.440 --> 0:03:26.880
<v Speaker 2>fair use.

0:03:27.200 --> 0:03:29.919
<v Speaker 1>Yeah, and I think that's an excellent and important point, that this is codified.

0:03:30.000 --> 0:03:32.800
<v Speaker 1>But at the same time, it very much is a

0:03:32.800 --> 0:03:35.400
<v Speaker 1>doctrine that has evolved through case law. I think it's

0:03:35.400 --> 0:03:37.960
<v Speaker 1>fair to say. I mean, there's been a number

0:03:38.000 --> 0:03:41.480
<v Speaker 1>of high profile cases in fair use, both in the

0:03:41.480 --> 0:03:43.960
<v Speaker 1>circuit courts and of course at the Supreme Court, but

0:03:44.080 --> 0:03:46.160
<v Speaker 1>probably relevant to what we're talking about today, to the

0:03:46.200 --> 0:03:48.800
<v Speaker 1>development of large language models, there's probably a few

0:03:48.960 --> 0:03:51.920
<v Speaker 1>that stand out, and I think one of those was the

0:03:52.080 --> 0:03:55.160
<v Speaker 1>Google Books litigation. Can you maybe tell us what that

0:03:55.280 --> 0:03:58.040
<v Speaker 1>litigation was about and why there might be some parallels

0:03:58.040 --> 0:03:59.680
<v Speaker 1>to these large language models.

0:04:00.560 --> 0:04:02.400
<v Speaker 2>Sure, and you know, I'm happy to talk about the

0:04:02.400 --> 0:04:05.280
<v Speaker 2>Google Books litigation. I'll also mention that I was a

0:04:05.320 --> 0:04:08.120
<v Speaker 2>litigator that worked on those cases, and in

0:04:08.160 --> 0:04:11.160
<v Speaker 2>those cases I represented the Authors Guild in lawsuits that

0:04:11.200 --> 0:04:14.640
<v Speaker 2>were brought against both Google and against a consortium of

0:04:14.720 --> 0:04:17.679
<v Speaker 2>libraries that worked with Google in connection with the Google

0:04:17.720 --> 0:04:21.440
<v Speaker 2>Books program. My views today are just talking as a

0:04:21.480 --> 0:04:24.000
<v Speaker 2>lawyer to opine on sort of the decisions that came

0:04:24.000 --> 0:04:26.600
<v Speaker 2>out in those cases, and certainly don't represent the views

0:04:26.600 --> 0:04:28.960
<v Speaker 2>of the Authors Guild, or the views of any parties

0:04:28.960 --> 0:04:30.720
<v Speaker 2>in those cases, or even the views of my law firm.

0:04:30.920 --> 0:04:32.720
<v Speaker 2>But I am happy, and I'm very familiar with those

0:04:32.760 --> 0:04:35.240
<v Speaker 2>cases and those decisions. So with that disclaimer, let me

0:04:35.240 --> 0:04:38.479
<v Speaker 2>just talk about those cases. Google made its mission to

0:04:38.560 --> 0:04:41.920
<v Speaker 2>make the entire world searchable, right, That was the goal

0:04:41.960 --> 0:04:44.280
<v Speaker 2>of Google was to make the entire world searchable. They

0:04:44.320 --> 0:04:46.880
<v Speaker 2>started with the World Wide Web, of course, and you know,

0:04:46.920 --> 0:04:49.240
<v Speaker 2>you go into Google and put in search terms and

0:04:49.320 --> 0:04:51.960
<v Speaker 2>get results back from the Web. And to do that

0:04:52.000 --> 0:04:54.279
<v Speaker 2>they had to copy, you know, essentially the entirety of

0:04:54.320 --> 0:04:57.000
<v Speaker 2>the World Wide Web in order for that to happen. Well,

0:04:57.040 --> 0:05:00.160
<v Speaker 2>Google decided that, you know, maybe that's not enough, and

0:05:00.279 --> 0:05:02.880
<v Speaker 2>we also want to make all of the world's knowledge,

0:05:02.880 --> 0:05:05.680
<v Speaker 2>including all of the books of the world, searchable. So

0:05:05.720 --> 0:05:09.720
<v Speaker 2>in order to do that, Google partnered with major university

0:05:09.760 --> 0:05:13.479
<v Speaker 2>libraries around the country, and they entered into agreements with

0:05:13.520 --> 0:05:17.080
<v Speaker 2>them to go to the libraries and take all of

0:05:17.120 --> 0:05:20.240
<v Speaker 2>their books, millions upon millions of books, many many of

0:05:20.279 --> 0:05:24.120
<v Speaker 2>which were protected by copyright still, and they set up

0:05:24.160 --> 0:05:28.000
<v Speaker 2>scanning stations and they scanned and reproduced and made copies

0:05:28.200 --> 0:05:31.279
<v Speaker 2>of millions upon millions of books in order to digitize

0:05:31.320 --> 0:05:34.479
<v Speaker 2>them and then run them through an optical character recognition

0:05:34.520 --> 0:05:37.719
<v Speaker 2>process, an OCR process, and then ultimately for the purpose

0:05:37.839 --> 0:05:40.919
<v Speaker 2>of creating a search index so that when you do searches,

0:05:40.960 --> 0:05:43.920
<v Speaker 2>either on books.google.com or even on

0:05:43.960 --> 0:05:46.600
<v Speaker 2>the main Google search engine, if you find keywords that

0:05:46.720 --> 0:05:49.400
<v Speaker 2>hit on words within the books, you would be able

0:05:49.400 --> 0:05:52.240
<v Speaker 2>to find those books. And what Google was doing was

0:05:52.240 --> 0:05:55.640
<v Speaker 2>displaying a snippet of where the keywords were located within

0:05:55.680 --> 0:05:57.080
<v Speaker 2>a book, and so you'd see that, oh, it was

0:05:57.080 --> 0:05:59.400
<v Speaker 2>in this book on this page or whatever. And the

0:05:59.440 --> 0:06:02.200
<v Speaker 2>other side, the university libraries, as part of the deal,

0:06:02.480 --> 0:06:05.680
<v Speaker 2>they also got access to this search index, and there

0:06:05.720 --> 0:06:08.080
<v Speaker 2>was a website called HathiTrust where you could put

0:06:08.080 --> 0:06:10.640
<v Speaker 2>in keywords and they didn't show snippets, but they did

0:06:10.760 --> 0:06:13.800
<v Speaker 2>show that in this book on this page, you'd be

0:06:13.800 --> 0:06:17.440
<v Speaker 2>able to find that information. The university libraries also decided

0:06:17.440 --> 0:06:18.960
<v Speaker 2>that as part of this process, they were going to

0:06:19.000 --> 0:06:23.000
<v Speaker 2>start making certain works that were quote orphan works available

0:06:23.560 --> 0:06:27.040
<v Speaker 2>to members of the university library to make them available

0:06:27.080 --> 0:06:30.440
<v Speaker 2>online. Well, authors around the country, indeed, authors around the

0:06:30.480 --> 0:06:34.839
<v Speaker 2>world were not happy that Google was digitizing and scanning

0:06:34.920 --> 0:06:39.479
<v Speaker 2>and making use potentially for commercial purposes through Google, millions

0:06:39.480 --> 0:06:42.520
<v Speaker 2>and millions of books without permission. And you know, books

0:06:42.560 --> 0:06:44.960
<v Speaker 2>being at sort of the core of what copyright protects,

0:06:45.240 --> 0:06:47.479
<v Speaker 2>protects the rights of authors, it protects the exclusive rights

0:06:47.480 --> 0:06:50.479
<v Speaker 2>of authors. And they felt that this was a

0:06:50.520 --> 0:06:54.360
<v Speaker 2>violation of their copyright, and so they filed lawsuits against Google.

0:06:54.400 --> 0:06:58.000
<v Speaker 2>They filed lawsuits against the university libraries alleging that the

0:06:58.120 --> 0:07:02.440
<v Speaker 2>digitization and scanning of the books was copyright infringement. And

0:07:02.520 --> 0:07:05.800
<v Speaker 2>in those cases, Google and the university libraries defended themselves

0:07:05.839 --> 0:07:09.080
<v Speaker 2>primarily on the ground that, yes, we made copies, but

0:07:09.120 --> 0:07:13.280
<v Speaker 2>those copies were a fair use, and they argued that

0:07:13.360 --> 0:07:17.640
<v Speaker 2>it was transformative that books were written and published for

0:07:17.720 --> 0:07:20.320
<v Speaker 2>one purpose, and we were using them for quite a

0:07:20.320 --> 0:07:23.600
<v Speaker 2>different purpose, which was to make them searchable. Also, they said,

0:07:24.080 --> 0:07:26.360
<v Speaker 2>to make them available for people that were blind was

0:07:26.360 --> 0:07:29.080
<v Speaker 2>another example of what they were doing. And the court,

0:07:29.200 --> 0:07:31.800
<v Speaker 2>ultimately the case went up to the Second

0:07:31.880 --> 0:07:35.440
<v Speaker 2>Circuit Court of Appeals, and the court found that indeed

0:07:35.480 --> 0:07:38.720
<v Speaker 2>it was a fair use and held that this digitization

0:07:39.200 --> 0:07:42.640
<v Speaker 2>for the purpose of making it searchable was a fair use.

0:07:43.120 --> 0:07:45.480
<v Speaker 1>Yeah, and I think you sort of alluded to it

0:07:45.520 --> 0:07:47.800
<v Speaker 1>then, and ultimately it did go up to the Second Circuit,

0:07:47.800 --> 0:07:50.760
<v Speaker 1>but this was by no means a swift resolution of

0:07:50.800 --> 0:07:53.080
<v Speaker 1>a case. I recall it going on for years and years.

0:07:53.120 --> 0:07:57.680
<v Speaker 1>There were potential settlements that I believe were ultimately rejected

0:07:57.960 --> 0:08:00.880
<v Speaker 1>by the court. But I think what to some extent,

0:08:00.920 --> 0:08:04.360
<v Speaker 1>what they showed is that these can be very difficult issues

0:08:04.600 --> 0:08:08.120
<v Speaker 1>to sort of resolve in court. Fair use is

0:08:08.200 --> 0:08:11.320
<v Speaker 1>kind of a squishy concept. There's very few bright-line

0:08:11.360 --> 0:08:13.120
<v Speaker 1>rules.</v> <v Speaker 2>Totally.</v>

0:08:13.200 --> 0:08:16.440
<v Speaker 2>And I'd also add, at a high level, you know,

0:08:17.240 --> 0:08:19.960
<v Speaker 2>that case sort of not just on the legal points,

0:08:20.000 --> 0:08:23.160
<v Speaker 2>does it have analogies to the current cases that authors

0:08:23.160 --> 0:08:26.000
<v Speaker 2>and creators are filing against the AI companies, but on

0:08:26.080 --> 0:08:29.800
<v Speaker 2>a feeling level, on an icky level, there's

0:08:29.840 --> 0:08:32.800
<v Speaker 2>a very similar sentiment where there was a sentiment that

0:08:33.320 --> 0:08:37.800
<v Speaker 2>you know, authors who are really you know, the Copyright

0:08:37.800 --> 0:08:41.400
<v Speaker 2>Act itself and indeed in the Constitution, it talks about

0:08:41.400 --> 0:08:44.920
<v Speaker 2>giving authors exclusive rights over their works for limited periods

0:08:44.920 --> 0:08:47.360
<v Speaker 2>in order to promote the progress of science and the arts. That's

0:08:47.400 --> 0:08:51.280
<v Speaker 2>the constitutional mandate of the Copyright Act, and authors, you know, felt,

0:08:51.360 --> 0:08:55.200
<v Speaker 2>here's this technology company coming in and scanning and digitizing

0:08:55.280 --> 0:08:58.920
<v Speaker 2>and making use and profiting off of our work and

0:08:58.960 --> 0:09:01.360
<v Speaker 2>they're claiming that it's fair use and that just seems wrong.

0:09:01.640 --> 0:09:03.480
<v Speaker 2>And I feel like there's the same kind of sentiment

0:09:03.559 --> 0:09:05.720
<v Speaker 2>now with the you know, the AI cases that we'll

0:09:05.720 --> 0:09:06.160
<v Speaker 2>get into.

0:09:06.440 --> 0:09:09.840
<v Speaker 1>Yeah, I think that's absolutely right. And I guess just

0:09:09.880 --> 0:09:11.680
<v Speaker 1>to drill down a little bit further on sort of

0:09:11.720 --> 0:09:14.720
<v Speaker 1>you mentioned it right there, the transformative use. I believe

0:09:14.720 --> 0:09:19.080
<v Speaker 1>that's ultimately how Google was decided. But that's only one

0:09:19.320 --> 0:09:22.000
<v Speaker 1>of the fair use factors. It's sort of,

0:09:22.040 --> 0:09:23.760
<v Speaker 1>I think, kind of baked in there. And the

0:09:23.760 --> 0:09:26.120
<v Speaker 1>fair use factors, do you want to really quickly sort

0:09:26.120 --> 0:09:28.920
<v Speaker 1>of run us through those fair use factors.

0:09:28.840 --> 0:09:30.640
<v Speaker 2>Let me run through the fair use factors, and also

0:09:30.720 --> 0:09:34.080
<v Speaker 2>in doing so, we'll talk about transformative use, because indeed

0:09:34.400 --> 0:09:39.199
<v Speaker 2>that word of transformativeness doesn't appear in the fair use factors.

0:09:39.240 --> 0:09:43.079
<v Speaker 2>That's a judicial creation. It's a creation really of a judge.

0:09:43.160 --> 0:09:45.840
<v Speaker 2>Judge Leval, from a Harvard Law Review article that

0:09:45.880 --> 0:09:47.880
<v Speaker 2>he wrote about fair use, and then the Supreme Court

0:09:47.920 --> 0:09:49.760
<v Speaker 2>picked it up in a case involving a 2 Live

0:09:49.760 --> 0:09:51.920
<v Speaker 2>Crew song. I won't go too far down that road,

0:09:51.920 --> 0:09:54.360
<v Speaker 2>but just to tick through the factors, okay,

0:09:54.520 --> 0:09:55.680
<v Speaker 2>I don't have it in front of me, but I

0:09:55.679 --> 0:09:57.720
<v Speaker 2>think I know it by heart, right, which is the

0:09:57.760 --> 0:10:01.840
<v Speaker 2>first factor is the purpose and character of the use.

0:10:02.880 --> 0:10:05.800
<v Speaker 2>The second factor is the nature of the work and

0:10:05.840 --> 0:10:07.600
<v Speaker 2>the nature of the work. And well, we'll go back

0:10:07.600 --> 0:10:09.400
<v Speaker 2>to the first factor because it's kind of the most important,

0:10:09.400 --> 0:10:10.959
<v Speaker 2>So I'll go through the four and then we'll go

0:10:11.000 --> 0:10:13.400
<v Speaker 2>back to the first. So purpose and character of the use.

0:10:13.600 --> 0:10:15.840
<v Speaker 2>The second one is the nature of the copyrighted work,

0:10:15.880 --> 0:10:17.800
<v Speaker 2>which is really like how close is it to the

0:10:17.800 --> 0:10:19.960
<v Speaker 2>core of copyright protection? And then you could think of

0:10:20.000 --> 0:10:23.160
<v Speaker 2>like a fictional work versus a nonfictional work. Copyright doesn't

0:10:23.160 --> 0:10:27.360
<v Speaker 2>protect ideas, so copyright will be more steadfast in protecting

0:10:27.400 --> 0:10:30.240
<v Speaker 2>a fictional work or a highly creative illustration, for example,

0:10:30.280 --> 0:10:33.280
<v Speaker 2>right versus something like an encyclopedia. The third factor is

0:10:33.280 --> 0:10:35.959
<v Speaker 2>the amount and substantiality of the use, right, how much

0:10:36.000 --> 0:10:37.800
<v Speaker 2>of it was used in relation to the whole sort

0:10:37.800 --> 0:10:40.800
<v Speaker 2>of a quantitative question. And then the fourth is the

0:10:41.280 --> 0:10:44.280
<v Speaker 2>harm of the use on the market. And so that's

0:10:44.320 --> 0:10:46.560
<v Speaker 2>really like a damages thing, like how harmful is this?

0:10:46.960 --> 0:10:51.240
<v Speaker 2>And really courts tend to weigh very strongly. The first factor,

0:10:51.280 --> 0:10:53.880
<v Speaker 2>which is the purpose and character, and the fourth factor. Now,

0:10:53.920 --> 0:10:56.000
<v Speaker 2>just going back, like I said, to the first factor,

0:10:56.559 --> 0:10:58.600
<v Speaker 2>when we talk about the purpose of the use, the

0:10:58.640 --> 0:11:01.520
<v Speaker 2>way that courts have sort of dealt

0:11:01.520 --> 0:11:04.880
<v Speaker 2>with this over the years, they've articulated the first factor

0:11:04.920 --> 0:11:07.640
<v Speaker 2>and broken it into two pieces. First, is it

0:11:07.679 --> 0:11:11.199
<v Speaker 2>a commercial use or a non commercial use? Until recently

0:11:11.640 --> 0:11:14.280
<v Speaker 2>that factor, or that sub-factor of factor one, got

0:11:14.400 --> 0:11:18.480
<v Speaker 2>very little, actually very little kind of deference. Right. People

0:11:18.520 --> 0:11:20.640
<v Speaker 2>often think, oh, if it's a nonprofit organization, they're going

0:11:20.679 --> 0:11:22.199
<v Speaker 2>to get a lot of deference. If it's a school,

0:11:22.240 --> 0:11:25.840
<v Speaker 2>if it's an educational institution. Indeed, that does provide some

0:11:26.640 --> 0:11:30.600
<v Speaker 2>it does provide some weight, but ultimately that's very rarely

0:11:30.640 --> 0:11:33.880
<v Speaker 2>been dispositive in a court's analysis about whether it's commercial

0:11:33.960 --> 0:11:36.840
<v Speaker 2>or non-commercial. What is the most important

0:11:36.880 --> 0:11:39.320
<v Speaker 2>factor and what courts have paid the most attention to

0:11:39.440 --> 0:11:42.400
<v Speaker 2>and what starts getting to the core of modern fair use

0:11:42.480 --> 0:11:45.760
<v Speaker 2>jurisprudence is the purpose of the use. And there what

0:11:45.840 --> 0:11:48.760
<v Speaker 2>the Supreme Court has said, originally in the case

0:11:48.760 --> 0:11:52.080
<v Speaker 2>involving 2 Live Crew and a kind

0:11:52.120 --> 0:11:54.000
<v Speaker 2>of a parody of what was held to be a

0:11:54.040 --> 0:11:56.720
<v Speaker 2>parody of the song Pretty Woman by Roy Orbison, and

0:11:56.760 --> 0:11:59.600
<v Speaker 2>a dirty version by 2 Live Crew. The Court held

0:11:59.600 --> 0:12:03.439
<v Speaker 2>that something like a parody is transformative because it's taking

0:12:03.640 --> 0:12:06.839
<v Speaker 2>the original work and it's commenting on the original and

0:12:06.920 --> 0:12:10.320
<v Speaker 2>shedding new light on the original, sort of adding something new,

0:12:10.800 --> 0:12:14.520
<v Speaker 2>and the new work has a new purpose

0:12:14.840 --> 0:12:17.120
<v Speaker 2>from the original. It has transformed the work from what

0:12:17.160 --> 0:12:21.080
<v Speaker 2>it was before into what it was later. And the

0:12:21.120 --> 0:12:23.400
<v Speaker 2>Supreme Court weighed in on that in the mid nineties

0:12:23.440 --> 0:12:26.160
<v Speaker 2>in that 2 Live Crew case, and hadn't really weighed

0:12:26.160 --> 0:12:28.520
<v Speaker 2>in on it again until just what last year or

0:12:28.559 --> 0:12:32.360
<v Speaker 2>two years ago when they decided that a work involving

0:12:33.000 --> 0:12:35.120
<v Speaker 2>Andy Warhol, and I imagine we'll get

0:12:35.120 --> 0:12:38.320
<v Speaker 2>to that, a work involving Andy Warhol, an Andy Warhol work,

0:12:38.640 --> 0:12:41.040
<v Speaker 2>was not held to be fair use. Though I

0:12:41.040 --> 0:12:43.800
<v Speaker 2>should be careful the way I say that the particular

0:12:44.240 --> 0:12:46.040
<v Speaker 2>use of an Andy Warhol work was held not to

0:12:46.080 --> 0:12:50.560
<v Speaker 2>be fair use, But the transformative use test kind of

0:12:50.600 --> 0:12:54.040
<v Speaker 2>controls frequently. And what you see in the cases involving

0:12:54.080 --> 0:12:57.360
<v Speaker 2>fair use is the defendant trying to argue that their

0:12:57.600 --> 0:13:01.880
<v Speaker 2>use is new and transformed against the original one, and

0:13:01.920 --> 0:13:04.880
<v Speaker 2>the plaintiffs saying, no, we did it for this purpose

0:13:04.920 --> 0:13:07.079
<v Speaker 2>and you're using it for the same purpose.

0:13:07.640 --> 0:13:11.120
<v Speaker 1>That's very very well spoken, and also you crushed it

0:13:11.200 --> 0:13:14.880
<v Speaker 1>on those various factors for not having them. Good job,

0:13:15.120 --> 0:13:17.920
<v Speaker 1>law school training, coming back. So I guess you drew

0:13:17.920 --> 0:13:20.600
<v Speaker 1>a lot of parallels between Google Books and so the

0:13:20.640 --> 0:13:23.400
<v Speaker 1>litigation that's going on now with the LLMs. But I'd like

0:13:23.440 --> 0:13:25.559
<v Speaker 1>to point out a distinction that I've written about, and

0:13:25.840 --> 0:13:30.200
<v Speaker 1>it's that Google Books was not necessarily Google's core business model.

0:13:30.280 --> 0:13:33.600
<v Speaker 1>You mentioned, yes, they wanted to digitize everything, but books

0:13:33.600 --> 0:13:36.120
<v Speaker 1>are really only a small portion of this. I don't

0:13:36.120 --> 0:13:38.880
<v Speaker 1>think it was ever expecting to make massive amounts of

0:13:38.960 --> 0:13:41.440
<v Speaker 1>revenue necessarily from Google Books. It was kind of a

0:13:41.440 --> 0:13:45.040
<v Speaker 1>side project. Obviously implications for authors, but for Google maybe

0:13:45.080 --> 0:13:47.600
<v Speaker 1>not core to the business model. I think that's that's

0:13:47.640 --> 0:13:50.360
<v Speaker 1>different than what we see now with OpenAI or

0:13:50.480 --> 0:13:54.920
<v Speaker 1>Anthropic or some other LLMs. The LLM is the business model,

0:13:55.480 --> 0:13:58.600
<v Speaker 1>and in the US copyright infringement damages they can be massive.

0:13:58.679 --> 0:14:01.880
<v Speaker 1>Of course, you can have for willful infringement. I believe

0:14:01.920 --> 0:14:04.760
<v Speaker 1>it's up to one hundred and fifty thousand dollars per work.

0:14:04.800 --> 0:14:07.880
<v Speaker 1>We're talking about millions and millions of potential works being used.

0:14:08.200 --> 0:14:10.280
<v Speaker 1>So I think this really does potentially pose sort of

0:14:10.360 --> 0:14:14.120
<v Speaker 1>an existential threat to those business models if the fair

0:14:14.240 --> 0:14:17.080
<v Speaker 1>use defense doesn't hold up. So I think this is

0:14:17.080 --> 0:14:19.840
<v Speaker 1>why I think it's so important as this field

0:14:19.880 --> 0:14:22.760
<v Speaker 1>is still kind of in its infancy. But let's turn back,

0:14:22.800 --> 0:14:25.280
<v Speaker 1>I think to the Warhol case, and also I think

0:14:26.040 --> 0:14:29.600
<v Speaker 1>the way Warhol was implemented more recently in probably the

0:14:29.600 --> 0:14:33.120
<v Speaker 1>first AI case that we've had, and that being Ross Intelligence.

0:14:33.360 --> 0:14:35.880
<v Speaker 1>Now you wrote about that, and I think in your

0:14:35.920 --> 0:14:39.160
<v Speaker 1>post you said this case helps delineate the boundaries of

0:14:39.240 --> 0:14:42.880
<v Speaker 1>acceptable fair use of copyrighted material for AI

0:14:42.960 --> 0:14:45.920
<v Speaker 1>model training. I guess can you tell us why was

0:14:46.080 --> 0:14:49.600
<v Speaker 1>what Ross did outside of those accepted boundaries?

0:14:49.920 --> 0:14:53.200
<v Speaker 2>Well, we don't know what the accepted boundaries are, right,

0:14:53.280 --> 0:14:56.080
<v Speaker 2>we don't know the extent of the accepted boundaries. What we have now,

0:14:56.120 --> 0:14:58.840
<v Speaker 2>And just to give a little context for listeners here

0:14:59.440 --> 0:15:02.640
<v Speaker 2>is you know, you have now upwards of forty cases

0:15:02.920 --> 0:15:06.920
<v Speaker 2>for copyright infringement or copyright adjacent type claims that have

0:15:06.960 --> 0:15:10.800
<v Speaker 2>been asserted against AI models for using copyright protected materials

0:15:10.840 --> 0:15:13.960
<v Speaker 2>without permission and without payment. And you know the you know,

0:15:14.000 --> 0:15:16.440
<v Speaker 2>the big ones and the most famous ones being sort

0:15:16.480 --> 0:15:18.800
<v Speaker 2>of like the New York Times filing a lawsuit against OpenAI

0:15:18.840 --> 0:15:21.920
<v Speaker 2>or Sarah Silverman filing similar types of lawsuits, and

0:15:22.440 --> 0:15:25.560
<v Speaker 2>those cases are against and it's sort of important to

0:15:25.640 --> 0:15:29.000
<v Speaker 2>understand that for your question to understand the differences of

0:15:29.120 --> 0:15:32.520
<v Speaker 2>the models and why this case is important those cases.

0:15:33.000 --> 0:15:34.880
<v Speaker 2>Let's just take the New York Times versus OpenAI.

0:15:35.320 --> 0:15:40.400
<v Speaker 2>That's a case by publishers and authors against a generative

0:15:40.440 --> 0:15:44.320
<v Speaker 2>AI large language model general purpose. That's sort of like

0:15:44.560 --> 0:15:48.640
<v Speaker 2>underlying it is this you know GPT model that's being

0:15:48.680 --> 0:15:52.000
<v Speaker 2>trained on billions and billions of points of training data

0:15:52.560 --> 0:15:55.760
<v Speaker 2>and has kind of limitless purposes. And people can put

0:15:55.760 --> 0:15:58.240
<v Speaker 2>those models to use in all sorts of industries for

0:15:58.280 --> 0:16:01.200
<v Speaker 2>all sorts of purposes. And you know, one application being

0:16:01.280 --> 0:16:05.520
<v Speaker 2>ChatGPT, for example. The case by Thomson Reuters against

0:16:05.600 --> 0:16:08.280
<v Speaker 2>Ross Intelligence, which was brought way back in twenty

0:16:08.320 --> 0:16:11.160
<v Speaker 2>twenty actually, so it really predates a lot of this.

0:16:11.680 --> 0:16:14.320
<v Speaker 2>That's a case, you know, where Thomson Reuters, which

0:16:14.360 --> 0:16:17.800
<v Speaker 2>you know, provides, among other things, the legal research tool Westlaw,

0:16:17.880 --> 0:16:19.920
<v Speaker 2>which is you know, my go to legal research tool

0:16:19.960 --> 0:16:23.320
<v Speaker 2>of choice, and they filed a lawsuit against Ross Intelligence,

0:16:23.360 --> 0:16:27.400
<v Speaker 2>which is a legal focused AI research tool, an AI

0:16:27.520 --> 0:16:31.640
<v Speaker 2>powered legal research tool, and they claimed that Ross Intelligence

0:16:31.760 --> 0:16:35.960
<v Speaker 2>used Westlaw's headnotes, which are you know, summaries of points,

0:16:36.040 --> 0:16:39.640
<v Speaker 2>legal points of holdings and what you know. Thomson Reuters says,

0:16:39.680 --> 0:16:42.560
<v Speaker 2>those are copyright protected, and they claimed that Ross Intelligence

0:16:42.680 --> 0:16:47.960
<v Speaker 2>used them to train the legal tool AI model and

0:16:48.120 --> 0:16:52.120
<v Speaker 2>claimed that this was copyright infringement. Ross Intelligence argued that, no,

0:16:52.320 --> 0:16:54.800
<v Speaker 2>we don't ever output the Westlaw headnotes and our

0:16:54.880 --> 0:16:58.120
<v Speaker 2>use is transformative and it's fair use. And in that case

0:16:58.600 --> 0:17:00.920
<v Speaker 2>the court held that it was not fair use. And

0:17:00.920 --> 0:17:04.360
<v Speaker 2>that decision came out just February eleventh, very recently, and

0:17:04.760 --> 0:17:09.040
<v Speaker 2>it is the first meaty, substantive fair use decision that

0:17:09.400 --> 0:17:12.439
<v Speaker 2>is in this existential question that you talked about in

0:17:12.480 --> 0:17:15.280
<v Speaker 2>the beginning, which is is the use of copyright protected

0:17:15.320 --> 0:17:19.399
<v Speaker 2>materials to train artificial intelligence models copyright infringement or is

0:17:19.440 --> 0:17:22.320
<v Speaker 2>it fair use? And in this particular case, the court

0:17:22.400 --> 0:17:26.359
<v Speaker 2>said that it was not fair use. Very important to understand, though,

0:17:26.720 --> 0:17:29.560
<v Speaker 2>is the particular facts of this case. As we said,

0:17:29.720 --> 0:17:32.840
<v Speaker 2>fair use is a case by case analysis. It's very

0:17:32.880 --> 0:17:38.680
<v Speaker 2>fact specific, and there are important distinctions between the facts

0:17:38.760 --> 0:17:41.600
<v Speaker 2>of the Ross Intelligence case and the facts of cases

0:17:41.680 --> 0:17:45.439
<v Speaker 2>like New York Times versus OpenAI. Whether those, and

0:17:45.480 --> 0:17:47.680
<v Speaker 2>we can talk about what those differences are and why

0:17:47.720 --> 0:17:50.760
<v Speaker 2>they might matter, whether those will be enough to change

0:17:50.800 --> 0:17:53.159
<v Speaker 2>the outcome in those cases is, as I called it, the

0:17:53.160 --> 0:17:55.919
<v Speaker 2>trillion dollar question. I say trillion dollar because of what

0:17:55.920 --> 0:17:59.960
<v Speaker 2>you said, right, I mean, you're talking about massive copyright damages,

0:18:00.040 --> 0:18:03.280
<v Speaker 2>potential copyright damages in an industry that is

0:18:03.440 --> 0:18:05.800
<v Speaker 2>now valued, probably, you're at Bloomberg, you're in a better

0:18:05.800 --> 0:18:08.080
<v Speaker 2>position to tell me. But I imagine it's a trillion

0:18:08.119 --> 0:18:12.080
<v Speaker 2>dollar industry now, and you know the outcomes of those

0:18:12.119 --> 0:18:14.960
<v Speaker 2>rulings could really uh you know, have an impact on that,

0:18:15.119 --> 0:18:17.000
<v Speaker 2>on that industry and the amount of damages that could

0:18:17.040 --> 0:18:17.880
<v Speaker 2>be at stake there.

0:18:18.440 --> 0:18:20.679
<v Speaker 1>Yeah, I mean, I mean AI has certainly been the

0:18:20.800 --> 0:18:23.280
<v Speaker 1>key theme in the investment community for years and years,

0:18:23.280 --> 0:18:26.200
<v Speaker 1>and the valuations seem to sort of double every

0:18:26.240 --> 0:18:29.040
<v Speaker 1>few months. So it's a massive amount. But yeah, let's

0:18:29.119 --> 0:18:31.639
<v Speaker 1>let's sort of dive into too. Why I think you

0:18:31.680 --> 0:18:34.600
<v Speaker 1>mentioned Ross was not generative AI. I think I think

0:18:34.640 --> 0:18:37.120
<v Speaker 1>the judge was pretty clear to draw that distinction.

0:18:37.520 --> 0:18:40.720
<v Speaker 1>But will that necessarily make it different in how

0:18:41.000 --> 0:18:44.920
<v Speaker 1>Ross applied I guess Warhol. When we do start dealing

0:18:45.000 --> 0:18:47.600
<v Speaker 1>with the generative AI defendants.

0:18:47.800 --> 0:18:50.440
<v Speaker 2>I I don't know. That's that's that's one that I'm

0:18:50.440 --> 0:18:52.359
<v Speaker 2>not going to give you a I don't have a

0:18:52.359 --> 0:18:55.159
<v Speaker 2>crystal ball, and I'm not I'm not even going to

0:18:55.320 --> 0:18:57.720
<v Speaker 2>give you my personal take on that. It's too that's

0:18:57.760 --> 0:19:00.320
<v Speaker 2>too dicey, that's too dicey. Right, what I here's what

0:19:00.359 --> 0:19:02.400
<v Speaker 2>I'll tell you. Let's just talk about why it might

0:19:02.480 --> 0:19:06.480
<v Speaker 2>matter as a why it might be a relevant distinction

0:19:06.640 --> 0:19:08.320
<v Speaker 2>and not what we call in the legal world a

0:19:08.359 --> 0:19:11.280
<v Speaker 2>distinction without a difference, right. And so sometimes you have

0:19:11.359 --> 0:19:14.120
<v Speaker 2>factual differences that don't have any legal bearing, and sometimes

0:19:14.119 --> 0:19:16.000
<v Speaker 2>they have legal bearing, and let's talk about why it

0:19:16.040 --> 0:19:18.639
<v Speaker 2>might matter. Here's an example. And this just goes to

0:19:18.680 --> 0:19:21.199
<v Speaker 2>the fact that the Ross Intelligence model was not a

0:19:21.240 --> 0:19:24.320
<v Speaker 2>generative AI. What that model allowed you to do was,

0:19:24.640 --> 0:19:26.600
<v Speaker 2>you know input. You would put in an input with

0:19:26.640 --> 0:19:30.560
<v Speaker 2>a legal research question and asking, you know, the AI model, hey,

0:19:30.560 --> 0:19:32.159
<v Speaker 2>what's the law on X, Y and Z, and then

0:19:32.160 --> 0:19:34.720
<v Speaker 2>it would come back with relevant judicial opinions, which are

0:19:34.760 --> 0:19:38.840
<v Speaker 2>public domain, and say, here's a judicial opinion that answers

0:19:38.840 --> 0:19:41.240
<v Speaker 2>the relevant model. The point of the Westlaw headnotes

0:19:41.480 --> 0:19:43.560
<v Speaker 2>why they used it was to help train it to

0:19:43.680 --> 0:19:46.480
<v Speaker 2>understand the way that people talk, because not everyone talks

0:19:46.520 --> 0:19:48.840
<v Speaker 2>like a judge in a judicial opinion. People talk more

0:19:48.880 --> 0:19:51.680
<v Speaker 2>like Westlaw headnotes than they talk like judges in opinions.

0:19:51.800 --> 0:19:53.600
<v Speaker 2>So they used those Westlaw headnotes to train the

0:19:53.640 --> 0:19:55.959
<v Speaker 2>AI to talk like a human and not just like

0:19:56.160 --> 0:19:59.840
<v Speaker 2>a lawyer or a judge. So they never

0:20:00.080 --> 0:20:03.520
<v Speaker 2>generated new text. Contrast that with something like ChatGPT,

0:20:03.800 --> 0:20:08.440
<v Speaker 2>which generates original texts that presumably in the outputs are

0:20:08.480 --> 0:20:11.439
<v Speaker 2>not going to You know, it could, but that's a

0:20:11.440 --> 0:20:15.359
<v Speaker 2>different question. The outputs likely don't infringe the underlying materials

0:20:15.400 --> 0:20:19.640
<v Speaker 2>that were input into the system. Courts have in cases

0:20:19.720 --> 0:20:22.880
<v Speaker 2>involving fair use looked at things like is the new

0:20:23.040 --> 0:20:28.040
<v Speaker 2>use going to create more content that's creative and new

0:20:28.240 --> 0:20:34.040
<v Speaker 2>and useful and ultimately driving and furthering the constitutional

0:20:34.040 --> 0:20:38.040
<v Speaker 2>mandate that I mentioned before in the Constitution, Article one,

0:20:38.920 --> 0:20:44.080
<v Speaker 2>Section eight, Clause eight says to promote the progress of

0:20:44.160 --> 0:20:47.040
<v Speaker 2>science and the useful arts, we're going to give authors

0:20:47.119 --> 0:20:49.520
<v Speaker 2>exclusive rights over their works for limited periods of time.

0:20:49.640 --> 0:20:52.480
<v Speaker 2>That's what copyright is. It's giving exclusive rights to authors

0:20:52.520 --> 0:20:56.360
<v Speaker 2>to promote progress of science and the arts. One thing

0:20:56.359 --> 0:20:58.920
<v Speaker 2>that a court might look at is does generative AI

0:20:59.280 --> 0:21:01.959
<v Speaker 2>help promote progress of science and useful arts by creating

0:21:01.960 --> 0:21:04.560
<v Speaker 2>all of this output that's useful for society. I think,

0:21:04.600 --> 0:21:07.000
<v Speaker 2>you know, another sort of angle on this which isn't

0:21:07.400 --> 0:21:09.840
<v Speaker 2>really about the generative AI, which is more like the

0:21:09.920 --> 0:21:12.640
<v Speaker 2>ultimate use of it. But another factor that plays into

0:21:12.680 --> 0:21:16.960
<v Speaker 2>that and is very important to understand the less about

0:21:16.960 --> 0:21:20.760
<v Speaker 2>the generative AI, which is the application of a large

0:21:20.800 --> 0:21:23.920
<v Speaker 2>language model or the application of the GPT for example,

0:21:23.920 --> 0:21:28.200
<v Speaker 2>that underlies OpenAI's model or an Anthropic equivalent. Right,

0:21:28.600 --> 0:21:32.600
<v Speaker 2>is what data is used to train those models and

0:21:32.640 --> 0:21:36.560
<v Speaker 2>to make them able to do the extraordinary work that

0:21:36.600 --> 0:21:39.800
<v Speaker 2>they do. And what the court found relevant in the

0:21:39.880 --> 0:21:43.440
<v Speaker 2>Ross Intelligence case is that they were using these

0:21:43.520 --> 0:21:46.520
<v Speaker 2>Westlaw headnotes in order to train it to provide something

0:21:46.520 --> 0:21:49.119
<v Speaker 2>that ultimately was found to be sort of competitive with

0:21:49.600 --> 0:21:52.160
<v Speaker 2>West Law. But what they said was, the court said,

0:21:52.400 --> 0:21:54.919
<v Speaker 2>you know, that's material that you could have gotten a

0:21:54.960 --> 0:21:58.199
<v Speaker 2>license for, right, you could have paid. It's not like

0:21:58.280 --> 0:22:01.720
<v Speaker 2>you're dealing with that large a body of work like

0:22:01.800 --> 0:22:05.320
<v Speaker 2>you could have created your own headnotes. You know, if

0:22:05.400 --> 0:22:09.080
<v Speaker 2>like Thomson Reuters, and the court said this, Thomson Reuters pay,

0:22:09.160 --> 0:22:11.560
<v Speaker 2>you know paid people authors, which is the way it's

0:22:11.560 --> 0:22:13.800
<v Speaker 2>supposed to work under copyright and pay them to go

0:22:13.880 --> 0:22:16.840
<v Speaker 2>and create headnotes, and like they get to protect those

0:22:16.880 --> 0:22:18.800
<v Speaker 2>and you have to pay for them, and you could

0:22:18.880 --> 0:22:22.720
<v Speaker 2>and there's really was nothing preventing getting a license from

0:22:22.760 --> 0:22:25.720
<v Speaker 2>a LexisNexis or a Westlaw or just hiring

0:22:26.240 --> 0:22:30.639
<v Speaker 2>authors to create their own headnotes. Now it's difficult to see.

0:22:30.720 --> 0:22:33.320
<v Speaker 2>And I think an argument that OpenAI is going

0:22:33.359 --> 0:22:36.760
<v Speaker 2>to, the OpenAIs and Anthropics of the world, I believe

0:22:36.880 --> 0:22:39.680
<v Speaker 2>will try to argue in those cases is to say

0:22:40.080 --> 0:22:43.119
<v Speaker 2>we couldn't license all of this stuff that is needed

0:22:43.119 --> 0:22:47.399
<v Speaker 2>to train these large language models. Let's emphasize large. We

0:22:47.480 --> 0:22:50.439
<v Speaker 2>needed to be really large. An argument that I think

0:22:50.480 --> 0:22:52.480
<v Speaker 2>that's going to be made, and whether this is going

0:22:52.560 --> 0:22:55.639
<v Speaker 2>to prevail again, I'm not gonna opine on it, but

0:22:55.680 --> 0:22:59.040
<v Speaker 2>I think the argument will be something like, we need

0:22:59.119 --> 0:23:01.320
<v Speaker 2>all of the language of the world, we need

0:23:01.400 --> 0:23:03.800
<v Speaker 2>all of the culture of the world, we need everything

0:23:03.840 --> 0:23:07.280
<v Speaker 2>that's out there, and we couldn't possibly under any circumstances,

0:23:07.359 --> 0:23:11.320
<v Speaker 2>license everything that we need in order to teach these

0:23:11.359 --> 0:23:16.080
<v Speaker 2>models to speak language, the second L of LLM, and to

0:23:16.840 --> 0:23:20.240
<v Speaker 2>understand culture and understand who we are. And so there

0:23:20.320 --> 0:23:23.480
<v Speaker 2>was there was a need and courts look to see

0:23:23.560 --> 0:23:25.240
<v Speaker 2>under the fair use factors like was there really a

0:23:25.359 --> 0:23:28.399
<v Speaker 2>need for what you were taking here without sort of

0:23:28.440 --> 0:23:31.000
<v Speaker 2>a license or without permission? And the argument again it

0:23:31.040 --> 0:23:33.200
<v Speaker 2>is a novel type of argument, and it's also there's

0:23:33.240 --> 0:23:35.360
<v Speaker 2>like a good rejoinder that I see the authors here make,

0:23:35.400 --> 0:23:37.560
<v Speaker 2>which is like, well, just because you can't have it

0:23:37.600 --> 0:23:40.080
<v Speaker 2>for free, maybe means you don't get it. It doesn't

0:23:40.080 --> 0:23:42.240
<v Speaker 2>mean like you get to just take it, right. But

0:23:42.440 --> 0:23:44.960
<v Speaker 2>I think the argument would be we need everything, we

0:23:45.040 --> 0:23:48.240
<v Speaker 2>need everything, and there's no world and there's no possibility

0:23:48.600 --> 0:23:51.360
<v Speaker 2>of ever having a licensing model in order to get

0:23:51.400 --> 0:23:52.200
<v Speaker 2>everything in the world.

0:23:52.359 --> 0:23:55.800
<v Speaker 1>Yeah, that's a really interesting point, and I think sort

0:23:55.840 --> 0:23:58.479
<v Speaker 1>of again goes to Ross Intelligence where sort of they

0:23:58.480 --> 0:24:00.880
<v Speaker 1>were in some ways sort of vertically focused. They

0:24:00.920 --> 0:24:04.679
<v Speaker 1>were sort of designed to compete with Thomson Reuters in

0:24:04.720 --> 0:24:07.360
<v Speaker 1>this area of headnote cases. And I think that might

0:24:07.560 --> 0:24:09.000
<v Speaker 1>to the extent that it might have some read through

0:24:09.040 --> 0:24:11.160
<v Speaker 1>to the generative AI companies. It's to some of these

0:24:11.160 --> 0:24:14.560
<v Speaker 1>cases where I also see in some of the complaints

0:24:14.600 --> 0:24:17.840
<v Speaker 1>the potential for that to be similar. I think some

0:24:17.920 --> 0:24:20.960
<v Speaker 1>of the visual artists against some of these AI models

0:24:20.960 --> 0:24:23.760
<v Speaker 1>that also output images that could be licensed. Also, I

0:24:23.760 --> 0:24:26.359
<v Speaker 1>think there's a new one by Dow Jones that I

0:24:26.359 --> 0:24:29.760
<v Speaker 1>think also raises in the complaint at least an argument that

0:24:29.800 --> 0:24:32.840
<v Speaker 1>you're actually doing this so that you can license sort

0:24:32.840 --> 0:24:35.359
<v Speaker 1>of newsflow news data. So I think that's what we

0:24:35.440 --> 0:24:38.320
<v Speaker 1>might see potential read through. But I think also, as

0:24:38.359 --> 0:24:40.600
<v Speaker 1>you mentioned, I think anybody who says they know how

0:24:40.680 --> 0:24:43.080
<v Speaker 1>this is going to go, there's no possible way. One

0:24:43.160 --> 0:24:45.760
<v Speaker 1>you have upwards of forty cases, they could go

0:24:46.000 --> 0:24:48.760
<v Speaker 1>very different ways. And two, the question is who is

0:24:48.800 --> 0:24:51.399
<v Speaker 1>going to decide fair use? Now, again, in Ross Intelligence,

0:24:51.920 --> 0:24:56.000
<v Speaker 1>it was decided at the summary judgment stage, but at

0:24:56.040 --> 0:24:59.400
<v Speaker 1>first the judge didn't want to decide it at that stage.

0:25:00.240 --> 0:25:02.760
<v Speaker 1>So I think it's a Again we talked about these

0:25:02.960 --> 0:25:06.879
<v Speaker 1>very hard concepts and judges struggle with them, and

0:25:06.920 --> 0:25:10.080
<v Speaker 1>juries are certainly going to struggle with them if they

0:25:10.119 --> 0:25:12.440
<v Speaker 1>are given at least some of these fair use factors.

0:25:12.720 --> 0:25:14.399
<v Speaker 1>Do you think we're going to see sort of maybe

0:25:14.600 --> 0:25:18.840
<v Speaker 1>potentially diverging precedent here, some judges deciding, some juries deciding,

0:25:19.000 --> 0:25:21.199
<v Speaker 1>some circuits, taking different stances on this.

0:25:21.480 --> 0:25:24.080
<v Speaker 2>I think it's very I think that is more likely

0:25:24.119 --> 0:25:27.119
<v Speaker 2>than not, and I think that's not I think that

0:25:27.160 --> 0:25:29.720
<v Speaker 2>it's more likely than not. I also think that in

0:25:29.760 --> 0:25:34.440
<v Speaker 2>general with jurisprudence, when you have multiple cases that deal

0:25:34.520 --> 0:25:38.320
<v Speaker 2>with you know, similar but not identical issues, what typically

0:25:38.400 --> 0:25:43.520
<v Speaker 2>you have is judges try not to reject the reasoning

0:25:43.640 --> 0:25:47.440
<v Speaker 2>of their brethren and sisters in the other courts. Rather,

0:25:47.480 --> 0:25:49.639
<v Speaker 2>what they try to do is to distinguish their cases

0:25:49.680 --> 0:25:52.760
<v Speaker 2>on the facts. And there's a lot of ground for

0:25:52.880 --> 0:25:56.000
<v Speaker 2>doing so in these cases. So if, for example, a

0:25:56.080 --> 0:25:59.480
<v Speaker 2>judge in one of the cases in the New York

0:25:59.480 --> 0:26:01.159
<v Speaker 2>Times case against OpenAI or any of the

0:26:01.200 --> 0:26:04.640
<v Speaker 2>other cases, even if they don't adopt the exact reasoning

0:26:04.760 --> 0:26:08.520
<v Speaker 2>of Judge Bibas who issued this decision in Ross Intelligence,

0:26:08.600 --> 0:26:10.800
<v Speaker 2>or if it goes up to the Third Circuit Court of Appeals,

0:26:11.200 --> 0:26:13.280
<v Speaker 2>even if they don't agree with the reasoning, what a

0:26:13.359 --> 0:26:15.840
<v Speaker 2>judge will try to do first is say our facts

0:26:15.840 --> 0:26:18.480
<v Speaker 2>are different and so we come out differently. It's more

0:26:18.600 --> 0:26:21.159
<v Speaker 2>likely that courts will do that than to, you know,

0:26:21.520 --> 0:26:24.639
<v Speaker 2>head on, disagree with the underlying reasoning. You know that

0:26:24.800 --> 0:26:27.560
<v Speaker 2>said right now, we just have a district level case.

0:26:28.240 --> 0:26:31.719
<v Speaker 2>It almost certainly I'd be surprised if it doesn't end

0:26:31.800 --> 0:26:35.000
<v Speaker 2>up on appeal. There's every reason to believe that a

0:26:35.000 --> 0:26:37.879
<v Speaker 2>circuit court could disagree with Judge Bibas. Judge Bibas is

0:26:37.960 --> 0:26:41.239
<v Speaker 2>sitting by designation. He's a circuit judge who's sitting by

0:26:41.240 --> 0:26:43.960
<v Speaker 2>designation in the district court. And like you said, just

0:26:44.000 --> 0:26:46.960
<v Speaker 2>a few months ago, he came out arguing that there

0:26:47.080 --> 0:26:48.800
<v Speaker 2>was a good ground for fair use and was going

0:26:48.840 --> 0:26:52.160
<v Speaker 2>to allow Ross Intelligence to make its fair use case

0:26:52.400 --> 0:26:55.919
<v Speaker 2>to the jury. And then you know, something changed and

0:26:56.000 --> 0:26:58.000
<v Speaker 2>he decided that it wasn't fair use as a matter

0:26:58.040 --> 0:27:00.719
<v Speaker 2>of law. So these are really close calls. I think

0:27:00.760 --> 0:27:03.000
<v Speaker 2>it's really likely that it goes up to a circuit.

0:27:03.040 --> 0:27:06.919
<v Speaker 2>I think it's very likely that judges in the Ninth Circuit,

0:27:06.960 --> 0:27:09.760
<v Speaker 2>for example, which tends to be fairly liberal around fair use,

0:27:09.800 --> 0:27:12.040
<v Speaker 2>and the Second Circuit, the judges tend to be fairly

0:27:12.080 --> 0:27:14.560
<v Speaker 2>liberal around technology and fair use. It was the Second

0:27:14.560 --> 0:27:17.520
<v Speaker 2>Circuit out of New York that held that Google Books

0:27:17.520 --> 0:27:20.119
<v Speaker 2>was fair use, and it's in the Ninth Circuit that

0:27:20.200 --> 0:27:23.040
<v Speaker 2>covers you know, San Francisco and all of the tech companies.

0:27:24.240 --> 0:27:25.720
<v Speaker 2>That doesn't mean that they're going to hold that this

0:27:25.800 --> 0:27:28.240
<v Speaker 2>is fair use. It does mean that you do have

0:27:28.320 --> 0:27:30.560
<v Speaker 2>a situation where you could end up with circuit splits,

0:27:30.600 --> 0:27:34.439
<v Speaker 2>and you also could no one would be surprised if

0:27:34.480 --> 0:27:37.640
<v Speaker 2>this does end up in the Supreme Court, and whether

0:27:37.680 --> 0:27:41.800
<v Speaker 2>the Supreme Court has the copyright expertise to handle a

0:27:41.840 --> 0:27:45.320
<v Speaker 2>case like this is a real question, and you know,

0:27:45.520 --> 0:27:48.399
<v Speaker 2>something that's creating a lot of uncertainty around what's going

0:27:48.440 --> 0:27:48.800
<v Speaker 2>to happen.

0:27:49.160 --> 0:27:51.720
<v Speaker 1>Yeah, absolutely, and I think Warhol I don't know if

0:27:51.760 --> 0:27:54.720
<v Speaker 1>people were necessarily stunned by the outcome, but I believe it.

0:27:54.720 --> 0:27:56.879
<v Speaker 1>It was seven to two, so it was a fairly

0:27:56.880 --> 0:27:59.520
<v Speaker 1>sizable majority. And now we should point out that often

0:28:00.119 --> 0:28:03.199
<v Speaker 1>IP in general, and also copyright, doesn't necessarily have the

0:28:03.200 --> 0:28:06.119
<v Speaker 1>partisan bent that a lot of other issues necessarily do.

0:28:06.600 --> 0:28:09.440
<v Speaker 1>But still seven to two was a fairly strong opinion

0:28:09.720 --> 0:28:12.440
<v Speaker 1>and again against fair use in that case. So it's

0:28:12.480 --> 0:28:15.680
<v Speaker 1>definitely an evolving landscape. I would say one thing that

0:28:15.760 --> 0:28:18.720
<v Speaker 1>we haven't necessarily talked about is sort of the input

0:28:18.960 --> 0:28:24.640
<v Speaker 1>versus output distinction in this debate. So you can either

0:28:24.680 --> 0:28:28.159
<v Speaker 1>have infringement based on the input that is, when the

0:28:28.280 --> 0:28:32.320
<v Speaker 1>LLM digests, scans, whatever it does to get that information

0:28:32.440 --> 0:28:34.800
<v Speaker 1>into the system to train on that data, that can

0:28:34.880 --> 0:28:39.320
<v Speaker 1>potentially be an infringement of an author's rights. Also, potentially

0:28:39.400 --> 0:28:43.880
<v Speaker 1>the AI output could infringe. There's also an entirely separate matter,

0:28:43.920 --> 0:28:48.120
<v Speaker 1>and that's whether an LLM can actually produce copyrightable content. But

0:28:48.280 --> 0:28:50.040
<v Speaker 1>we're going to leave that aside for now. That's an

0:28:50.160 --> 0:28:53.360
<v Speaker 1>entirely separate conversation where we don't have necessarily litigation

0:28:53.800 --> 0:28:56.920
<v Speaker 1>to dive into that. But the input output, What are

0:28:56.920 --> 0:28:58.880
<v Speaker 1>your views on that and how that might play out?

0:28:59.120 --> 0:29:02.120
<v Speaker 2>Yeah, I mean it can play out in a few

0:29:02.120 --> 0:29:04.320
<v Speaker 2>different ways. So one, let's just talk about how the

0:29:04.360 --> 0:29:09.240
<v Speaker 2>output can play into the questions that really involve the input.

0:29:09.360 --> 0:29:11.440
<v Speaker 2>And you know, the easiest way to think about that

0:29:11.600 --> 0:29:15.000
<v Speaker 2>is in the cases that have been filed against the

0:29:15.640 --> 0:29:19.400
<v Speaker 2>against the LLMs like OpenAI. There is an effort

0:29:19.560 --> 0:29:21.800
<v Speaker 2>by authors in some of those cases, including The New

0:29:21.880 --> 0:29:26.880
<v Speaker 2>York Times, to argue that our articles are being reproduced

0:29:26.960 --> 0:29:31.560
<v Speaker 2>almost verbatim when users ask certain queries. Right, so user

0:29:31.600 --> 0:29:33.560
<v Speaker 2>put an input, tell me what happened in this article,

0:29:34.000 --> 0:29:38.240
<v Speaker 2>and according to the New York Times complaint, ChatGPT will

0:29:38.840 --> 0:29:42.560
<v Speaker 2>with certain prompting output material that the New York Times

0:29:42.640 --> 0:29:47.280
<v Speaker 2>argues is substantially similar to the articles. Now if that's true,

0:29:47.360 --> 0:29:49.880
<v Speaker 2>then it becomes a very much easier case for the

0:29:49.880 --> 0:29:51.880
<v Speaker 2>New York Times. Right then it's like, well, what's the

0:29:51.920 --> 0:29:55.880
<v Speaker 2>difference between, you know, that versus just a pirate website

0:29:55.880 --> 0:29:58.280
<v Speaker 2>that copies New York Times articles and makes them available

0:29:58.280 --> 0:29:59.760
<v Speaker 2>for free. If you're able to just go into this

0:29:59.840 --> 0:30:03.000
<v Speaker 2>and reproduce articles in full, there's sort of little

0:30:03.040 --> 0:30:06.400
<v Speaker 2>doubt that that would be sort of an infringement.

0:30:06.440 --> 0:30:08.720
<v Speaker 2>OpenAI is really contesting that that's the way the tool

0:30:08.800 --> 0:30:11.840
<v Speaker 2>is supposed to be used, and arguing that, you know,

0:30:12.280 --> 0:30:14.600
<v Speaker 2>the only way that you were able to get

0:30:14.640 --> 0:30:18.760
<v Speaker 2>ChatGPT to display those outputs was by basically taking the

0:30:18.800 --> 0:30:21.160
<v Speaker 2>model and beating it with a stick until it came

0:30:21.160 --> 0:30:23.680
<v Speaker 2>out with the articles that were verbatim copies. And so

0:30:24.080 --> 0:30:28.520
<v Speaker 2>it's much easier under copyright, much easier to argue that

0:30:28.800 --> 0:30:33.200
<v Speaker 2>the technology you've created is supplanting the original if the

0:30:33.240 --> 0:30:37.360
<v Speaker 2>outputs are generally substantially similar to the inputs, right, that's

0:30:37.360 --> 0:30:41.000
<v Speaker 2>sort of like pretty pretty intuitive. And so the authors

0:30:41.040 --> 0:30:43.320
<v Speaker 2>are frequently trying to focus on the outputs, and they've

0:30:43.320 --> 0:30:46.120
<v Speaker 2>tried different theories. Another theory that was tried and rejected

0:30:46.400 --> 0:30:50.520
<v Speaker 2>was in one of those cases involving images. The argument was,

0:30:50.920 --> 0:30:54.760
<v Speaker 2>because all of your images that you output from your

0:30:54.840 --> 0:31:00.240
<v Speaker 2>model are trained on our copyright protected images, every thing

0:31:00.360 --> 0:31:03.360
<v Speaker 2>that is output from your model is a derivative of

0:31:03.440 --> 0:31:06.600
<v Speaker 2>our inputs. Even if the output is not substantially similar

0:31:06.600 --> 0:31:11.000
<v Speaker 2>to any particular input, all of the outputs are derivative

0:31:11.040 --> 0:31:14.280
<v Speaker 2>works of the original. A novel argument, but was rejected

0:31:14.320 --> 0:31:17.520
<v Speaker 2>already on a motion to dismiss by the court, because

0:31:17.560 --> 0:31:19.520
<v Speaker 2>the law of copyright is that in order for

0:31:19.560 --> 0:31:22.080
<v Speaker 2>something to be a derivative work, it has to be

0:31:22.120 --> 0:31:24.840
<v Speaker 2>substantially similar in some way to the original. And so

0:31:24.880 --> 0:31:27.480
<v Speaker 2>the court said, no, that's not the way derivative works go.

0:31:27.880 --> 0:31:31.239
<v Speaker 2>So in many of the cases, the plaintiffs and the

0:31:31.280 --> 0:31:36.320
<v Speaker 2>authors are having to focus primarily on the inputs and

0:31:36.360 --> 0:31:41.280
<v Speaker 2>to say that there's still infringement. Even if the outputs

0:31:41.320 --> 0:31:44.240
<v Speaker 2>are not substantially similar, they're infringing. Let me just say

0:31:44.320 --> 0:31:46.960
<v Speaker 2>one other category of why the outputs matter, which is

0:31:47.000 --> 0:31:50.200
<v Speaker 2>sort of a connected question, and one that the defendants

0:31:50.320 --> 0:31:53.880
<v Speaker 2>the models are happier to have better ground for them

0:31:53.880 --> 0:31:57.200
<v Speaker 2>to argue, which is this. Listen, if you're the user

0:31:57.240 --> 0:32:00.000
<v Speaker 2>and you're controlling this platform, and you create an output

0:32:00.280 --> 0:32:03.400
<v Speaker 2>that because of your prompting and because of your directions,

0:32:03.720 --> 0:32:06.680
<v Speaker 2>turns out to be infringing. And then, for example, you

0:32:06.800 --> 0:32:09.680
<v Speaker 2>make some commercial use of that image. Right, take you

0:32:09.720 --> 0:32:12.520
<v Speaker 2>create an image, you put it into an advertisement, and

0:32:12.560 --> 0:32:16.600
<v Speaker 2>you start selling shoes with this, you know, infringing image.

0:32:16.720 --> 0:32:19.680
<v Speaker 2>The models would like to say that's kind of on you,

0:32:20.080 --> 0:32:23.160
<v Speaker 2>and that you were the one who engaged in that,

0:32:23.480 --> 0:32:26.920
<v Speaker 2>and that the model had no volitional conduct. And part

0:32:26.960 --> 0:32:29.720
<v Speaker 2>of copyright is you kind of have to It's not willful,

0:32:29.720 --> 0:32:31.440
<v Speaker 2>it's not like a malevolent thing, but you at least

0:32:31.480 --> 0:32:34.640
<v Speaker 2>have to have some volitional intent, Like you know, if

0:32:34.640 --> 0:32:36.680
<v Speaker 2>you sneeze and you have an accident, often you can

0:32:36.720 --> 0:32:39.640
<v Speaker 2>get off the hook, like it's an involuntary movement. Right.

0:32:40.200 --> 0:32:43.280
<v Speaker 2>They kind of want to argue that if somebody is

0:32:43.400 --> 0:32:45.720
<v Speaker 2>using the tool in a way that it's not supposed

0:32:45.720 --> 0:32:48.360
<v Speaker 2>to be used, or create some kind of infringing content,

0:32:48.600 --> 0:32:51.280
<v Speaker 2>then who should be responsible. Should it be the user

0:32:51.440 --> 0:32:53.440
<v Speaker 2>of the platform or should it be the model. And

0:32:53.480 --> 0:32:56.320
<v Speaker 2>that's the other question that sort of hasn't been answered

0:32:56.360 --> 0:32:57.080
<v Speaker 2>yet by courts.

0:32:57.240 --> 0:33:00.000
<v Speaker 1>So it sounds like potentially there's some contributory issue.

0:33:00.480 --> 0:33:01.920
<v Speaker 2>That's that's right, Well, that's what the that's what the

0:33:01.920 --> 0:33:02.880
<v Speaker 2>plaintiffs would argue.

0:33:02.840 --> 0:33:04.720
<v Speaker 1>Yeah, yeah, yeah, so that's uh that that could get

0:33:04.760 --> 0:33:09.200
<v Speaker 1>even more tricky to wade through. So when you're you're

0:33:09.200 --> 0:33:11.840
<v Speaker 1>talking about how the New York Times, I think that's right.

0:33:11.880 --> 0:33:14.680
<v Speaker 1>In their complaint, they're like, these are regurgitation of our articles,

0:33:14.920 --> 0:33:17.280
<v Speaker 1>I think, and some of the authors' complaints potentially of

0:33:17.320 --> 0:33:20.360
<v Speaker 1>the Authors Guild or maybe the original Silverman complaint, also

0:33:20.440 --> 0:33:22.960
<v Speaker 1>had that, And it strikes me as also interesting because

0:33:23.000 --> 0:33:26.000
<v Speaker 1>we also may have divergence there because a news article

0:33:26.080 --> 0:33:30.000
<v Speaker 1>that is largely a regurgitation of potentially a historical event

0:33:30.440 --> 0:33:33.320
<v Speaker 1>might have less copyright protection under sort of the second

0:33:33.360 --> 0:33:36.400
<v Speaker 1>fair use factor than potentially a novel. So so it's

0:33:36.680 --> 0:33:39.760
<v Speaker 1>there's so many different directions that this could evolve that, Yeah,

0:33:39.760 --> 0:33:41.240
<v Speaker 1>it's going to be really interesting to see how this

0:33:41.280 --> 0:33:43.280
<v Speaker 1>plays out over the probably in the next two years

0:33:43.360 --> 0:33:47.000
<v Speaker 1>or so. Just really quickly, I want to touch on licensing.

0:33:47.320 --> 0:33:50.680
<v Speaker 1>You sort of mentioned before that a potential argument will

0:33:50.680 --> 0:33:54.000
<v Speaker 1>be that we can't license the world, but they certainly

0:33:54.000 --> 0:33:59.200
<v Speaker 1>are licensing a lot. We've seen so many large licensing agreements.

0:33:59.280 --> 0:34:02.160
<v Speaker 1>I think the largest ones I can remember is Google

0:34:02.200 --> 0:34:05.040
<v Speaker 1>with Reddit for I think something like sixty million dollars

0:34:05.080 --> 0:34:07.320
<v Speaker 1>a year. New York Times, of course, has sued. A

0:34:07.320 --> 0:34:10.560
<v Speaker 1>lot of other newspaper publishers have sued, but a lot

0:34:10.640 --> 0:34:14.040
<v Speaker 1>of them have also just agreed to licensing agreements with

0:34:14.400 --> 0:34:17.040
<v Speaker 1>whether it's OpenAI, whether it's Google, whether it's some

0:34:17.080 --> 0:34:20.720
<v Speaker 1>of these other LLMs. So how does that potentially cut

0:34:20.800 --> 0:34:25.600
<v Speaker 1>against an LLM in fair use if they can go out

0:34:25.640 --> 0:34:28.160
<v Speaker 1>and license a lot of this content, or does it

0:34:28.200 --> 0:34:31.399
<v Speaker 1>maybe not impact it? Or again who knows?

0:34:31.680 --> 0:34:36.799
<v Speaker 2>Yeah, I mean so one of the challenges in just

0:34:36.840 --> 0:34:39.120
<v Speaker 2>going back to the Google Books case that we talked

0:34:39.120 --> 0:34:42.120
<v Speaker 2>about in the beginning, one of the challenges in that

0:34:42.239 --> 0:34:46.839
<v Speaker 2>case for the authors was proving that there was a

0:34:47.080 --> 0:34:50.359
<v Speaker 2>licensing market for the uses that were being made by

0:34:50.360 --> 0:34:53.080
<v Speaker 2>Google Books, and that was a challenge, and the court

0:34:53.080 --> 0:34:56.000
<v Speaker 2>picked up on that challenge. And that fourth factor, which

0:34:56.040 --> 0:34:58.680
<v Speaker 2>is like the harm to the market for the copyrighted

0:34:58.880 --> 0:35:02.080
<v Speaker 2>work the court found in that case was that there

0:35:02.160 --> 0:35:06.279
<v Speaker 2>was no market for licensing books for the purpose of

0:35:06.360 --> 0:35:09.720
<v Speaker 2>creating a search index. No one ever had paid any author.

0:35:09.960 --> 0:35:11.759
<v Speaker 2>I want to use your book so that I can

0:35:11.840 --> 0:35:14.719
<v Speaker 2>make it searchable on the Internet. Without displaying the book,

0:35:14.719 --> 0:35:16.160
<v Speaker 2>people won't be able to read the book, you'll just

0:35:16.200 --> 0:35:19.239
<v Speaker 2>be able to find where the information lives inside of

0:35:19.239 --> 0:35:21.319
<v Speaker 2>a book. No one had ever paid for that, and

0:35:21.440 --> 0:35:25.080
<v Speaker 2>so the court had difficulties seeing how that was harming

0:35:25.400 --> 0:35:30.520
<v Speaker 2>the economic interests of the author's copyright. Okay, contrast that

0:35:30.600 --> 0:35:33.319
<v Speaker 2>with what you just talked about, where you have a

0:35:33.360 --> 0:35:38.319
<v Speaker 2>fledgling licensing market for using copyright protected works to help

0:35:38.400 --> 0:35:41.759
<v Speaker 2>train AIs. And when you have the AI companies that

0:35:41.800 --> 0:35:43.880
<v Speaker 2>are trying to defend themselves with fair use, are they

0:35:43.880 --> 0:35:46.640
<v Speaker 2>cutting their nose to spite their face by entering into

0:35:46.640 --> 0:35:49.400
<v Speaker 2>these agreements and creating a market that didn't really exist before.

0:35:49.640 --> 0:35:53.600
<v Speaker 2>You know, that's a new novel market that now exists.

0:35:53.760 --> 0:35:58.400
<v Speaker 2>And now the plaintiffs in those cases have fodder to

0:35:58.800 --> 0:36:01.879
<v Speaker 2>argue that there's a market harm and it's a there's

0:36:01.880 --> 0:36:07.920
<v Speaker 2>a cognizable developing and you know, evidenced licensing market. The

0:36:08.040 --> 0:36:13.360
<v Speaker 2>question is whether that will really undercut the argument that

0:36:13.400 --> 0:36:15.000
<v Speaker 2>I was talking about you just raise, which is like

0:36:15.000 --> 0:36:17.840
<v Speaker 2>we need to digitize the entire world, because yeah, so

0:36:17.880 --> 0:36:20.840
<v Speaker 2>that's number one, which is just because we can license

0:36:20.880 --> 0:36:24.439
<v Speaker 2>some of the materials from certain players, that's a far

0:36:24.560 --> 0:36:28.359
<v Speaker 2>cry from the entirety of the world's knowledge, right, And

0:36:28.440 --> 0:36:31.160
<v Speaker 2>so and that's not it. We still couldn't license nearly

0:36:31.280 --> 0:36:34.080
<v Speaker 2>enough data to train all these models to the point

0:36:34.080 --> 0:36:37.040
<v Speaker 2>where they are. And you have new players in the

0:36:37.040 --> 0:36:40.319
<v Speaker 2>market like DeepSeek, which had originally said something like

0:36:40.520 --> 0:36:42.319
<v Speaker 2>we don't need to use that much training data, but

0:36:42.360 --> 0:36:44.520
<v Speaker 2>it was kind of an artificial argument because they needed

0:36:44.560 --> 0:36:47.400
<v Speaker 2>to do that based on the models that had already

0:36:47.400 --> 0:36:51.200
<v Speaker 2>been developed using the entire you know, these extraordinarily massive

0:36:51.200 --> 0:36:55.919
<v Speaker 2>amounts of training data. So there's that. The other important

0:36:56.280 --> 0:36:59.400
<v Speaker 2>argument that I imagine will come out is by the

0:36:59.560 --> 0:37:03.400
<v Speaker 2>AI platforms, is you're comparing apples and oranges,

0:37:03.680 --> 0:37:07.400
<v Speaker 2>that this is not getting the entire corpus of the

0:37:07.440 --> 0:37:10.799
<v Speaker 2>Internet and sucking it into our model. When we enter

0:37:10.880 --> 0:37:14.920
<v Speaker 2>into these license agreements, in many cases, what we're paying

0:37:14.960 --> 0:37:18.320
<v Speaker 2>for is not just the underlying, like, raw data.

0:37:18.480 --> 0:37:20.879
<v Speaker 2>We're paying for other things that are more valuable to us.

0:37:20.880 --> 0:37:25.360
<v Speaker 2>For example, you're giving us access to archived materials that

0:37:25.400 --> 0:37:28.440
<v Speaker 2>are not openly available online. You're giving us access to

0:37:28.480 --> 0:37:32.000
<v Speaker 2>materials that are behind a paywall. You're also allowing us

0:37:32.040 --> 0:37:36.239
<v Speaker 2>in many instances to display the outputs or like you know,

0:37:36.239 --> 0:37:38.960
<v Speaker 2>some of these agreements that are out there and I've

0:37:38.960 --> 0:37:40.560
<v Speaker 2>worked on many of them, I see many of them.

0:37:40.800 --> 0:37:44.360
<v Speaker 2>They say things like, you can display x you know,

0:37:44.920 --> 0:37:47.360
<v Speaker 2>one hundred or tens or whatever it is, X number

0:37:47.400 --> 0:37:50.319
<v Speaker 2>of words from the article or from the original book

0:37:50.400 --> 0:37:53.360
<v Speaker 2>or whatever it is, and display that to users in

0:37:53.400 --> 0:37:56.359
<v Speaker 2>response to searches, provided that there's a link back, et cetera.

0:37:56.480 --> 0:37:59.480
<v Speaker 2>Sometimes a link, sometimes not. And those are rights and

0:37:59.600 --> 0:38:02.879
<v Speaker 2>privileges that are not part of what the AIs are doing.

0:38:02.880 --> 0:38:04.880
<v Speaker 2>So they would say, you know, it's not really fair

0:38:05.280 --> 0:38:07.600
<v Speaker 2>to look to those license markets and look to those

0:38:07.680 --> 0:38:10.640
<v Speaker 2>you know, agreements as evidence that there's a licensing market,

0:38:10.680 --> 0:38:13.759
<v Speaker 2>because this is a totally different sort of arrangement and

0:38:13.840 --> 0:38:17.239
<v Speaker 2>that's what we're paying for. Again, I'm not calling balls

0:38:17.239 --> 0:38:20.040
<v Speaker 2>and strikes here. I'm just saying that these are some

0:38:20.160 --> 0:38:22.759
<v Speaker 2>of the arguments that I think will be made and

0:38:22.800 --> 0:38:25.120
<v Speaker 2>some of the distinctions that will be attempted to be drawn,

0:38:25.239 --> 0:38:27.640
<v Speaker 2>and that, you know, could sway the court

0:38:27.680 --> 0:38:28.680
<v Speaker 2>one way or the other. Yeah.

0:38:28.680 --> 0:38:31.880
<v Speaker 1>Absolutely, And I think it also sort of shows the

0:38:31.920 --> 0:38:36.280
<v Speaker 1>appetite to train these models on things like news content,

0:38:36.520 --> 0:38:39.319
<v Speaker 1>on things like novels. I think I was reading some

0:38:39.440 --> 0:38:42.560
<v Speaker 1>executive at some of these companies, like, there's no better way

0:38:42.640 --> 0:38:46.120
<v Speaker 1>to train an LLM than on novels, seeing

0:38:46.120 --> 0:38:50.040
<v Speaker 1>how people speak, you know, it's just sort of some

0:38:50.120 --> 0:38:52.200
<v Speaker 1>of the best data that they can get

0:38:52.200 --> 0:38:53.920
<v Speaker 1>to train these models on. And I think it'll be

0:38:53.960 --> 0:38:57.720
<v Speaker 1>interesting to see how these cases shape up.

0:38:58.239 --> 0:39:01.080
<v Speaker 2>I do think what's encouraging, because look, I'm

0:39:01.120 --> 0:39:04.880
<v Speaker 2>somebody who it's certainly in this context I can see

0:39:05.080 --> 0:39:07.960
<v Speaker 2>not to see both sides, but I certainly feel an

0:39:07.960 --> 0:39:11.319
<v Speaker 2>affinity towards authors and artists and creators right. I think

0:39:11.320 --> 0:39:15.160
<v Speaker 2>that copyright does exist to support their work and their effort,

0:39:15.280 --> 0:39:17.560
<v Speaker 2>and they should be paid for their work and their effort.

0:39:17.719 --> 0:39:19.640
<v Speaker 2>I think that there's a lot of doom and gloom

0:39:19.760 --> 0:39:24.120
<v Speaker 2>and that's also appropriate and understandable and fear around what's

0:39:24.160 --> 0:39:27.239
<v Speaker 2>going to happen with these models, and totally totally get it.

0:39:27.520 --> 0:39:31.600
<v Speaker 2>What's also nice to see is I've seen authors get paid, like,

0:39:31.680 --> 0:39:33.480
<v Speaker 2>you know, all of a sudden, get a check for

0:39:33.640 --> 0:39:36.319
<v Speaker 2>like twenty five hundred bucks, you know, and of a

0:39:36.360 --> 0:39:38.480
<v Speaker 2>book that hasn't sold in many, many years, and all

0:39:38.480 --> 0:39:40.720
<v Speaker 2>of a sudden, like some publishers are entering into deals

0:39:40.760 --> 0:39:42.239
<v Speaker 2>and like authors are getting a check for a few

0:39:42.239 --> 0:39:44.800
<v Speaker 2>thousand bucks for a work because they've entered into a

0:39:44.840 --> 0:39:47.240
<v Speaker 2>deal to you know, use it to train an AI

0:39:47.600 --> 0:39:49.319
<v Speaker 2>or you know, make some kind of use in it.

0:39:49.440 --> 0:39:52.400
<v Speaker 2>So that to me is also encouraging and so sort

0:39:52.440 --> 0:39:54.800
<v Speaker 2>of from a business standpoint, you really see the development

0:39:54.840 --> 0:39:56.760
<v Speaker 2>of these new markets, and that to me is exciting

0:39:56.960 --> 0:39:59.279
<v Speaker 2>and encouraging. Although again I know that there's a lot

0:39:59.280 --> 0:40:03.960
<v Speaker 2>of anger and fear and you know, litigation over the

0:40:04.000 --> 0:40:05.919
<v Speaker 2>broader issues, but it is nice to see that there's

0:40:06.160 --> 0:40:09.040
<v Speaker 2>it's nice to see authors getting paid, including in connection

0:40:09.120 --> 0:40:10.680
<v Speaker 2>with these new markets. Yeah.

0:40:10.719 --> 0:40:13.880
<v Speaker 1>Sure, And I think overall I look at the markets,

0:40:14.120 --> 0:40:16.680
<v Speaker 1>and my feeling has been that I think the markets

0:40:16.920 --> 0:40:19.520
<v Speaker 1>are to some extent underestimating just how much of an

0:40:19.560 --> 0:40:23.200
<v Speaker 1>overhang this could potentially be for large language models. And

0:40:23.320 --> 0:40:26.040
<v Speaker 1>I think, you know, our discussion today sort of touched

0:40:26.080 --> 0:40:28.839
<v Speaker 1>on just how tricky this situation is, how fair use

0:40:28.880 --> 0:40:31.360
<v Speaker 1>can go one way and in another way one court,

0:40:31.400 --> 0:40:33.960
<v Speaker 1>it may certainly be fair use, another court maybe not.

0:40:34.120 --> 0:40:37.160
<v Speaker 1>Distinction is drawn. I think we're going to see some

0:40:37.200 --> 0:40:40.360
<v Speaker 1>more summary judgment decisions ripen in twenty twenty five. I

0:40:40.400 --> 0:40:42.960
<v Speaker 1>believe one of the Meta cases, I believe they're briefing

0:40:43.000 --> 0:40:45.640
<v Speaker 1>that in the earlier part of this year, and I think there's

0:40:45.640 --> 0:40:48.560
<v Speaker 1>a case with Judge Alsup out in California with Anthropic

0:40:48.600 --> 0:40:51.720
<v Speaker 1>where I think they're scheduled to do both summary judgment,

0:40:51.760 --> 0:40:54.200
<v Speaker 1>fair use and sort of class certification later in the year.

0:40:54.239 --> 0:40:56.719
<v Speaker 1>So I think it's possible we start to get some

0:40:56.760 --> 0:40:59.759
<v Speaker 1>clarity on this in twenty twenty five, maybe moving into

0:40:59.800 --> 0:41:04.320
<v Speaker 1>twenty six, but it's gonna be an absolutely fascinating area

0:41:04.400 --> 0:41:07.160
<v Speaker 1>to watch, I think, over the next few years. So

0:41:07.560 --> 0:41:09.399
<v Speaker 1>I think we're going to leave it there. I think

0:41:09.400 --> 0:41:11.920
<v Speaker 1>we've covered a lot of ground. Jeremy, I really appreciate

0:41:11.920 --> 0:41:14.080
<v Speaker 1>you coming on today and all of your really

0:41:14.120 --> 0:41:17.120
<v Speaker 1>invaluable insights on this, I think, and it is really

0:41:17.120 --> 0:41:20.560
<v Speaker 1>good to hear from a practitioner's perspective. I'm not

0:41:20.600 --> 0:41:24.600
<v Speaker 1>wrong that these are challenging issues. It's it's a yeah,

0:41:24.719 --> 0:41:27.320
<v Speaker 1>it's gonna be very interesting to see how this plays out. Jeremy,

0:41:27.320 --> 0:41:27.799
<v Speaker 1>thanks so much.

0:41:28.160 --> 0:41:29.440
<v Speaker 2>Thanks, Tamlin. Anytime.