RIAA Of Six Years Ago Debunks RIAA Of Today’s AI Lawsuit Claims

from the insert-spiderman-v.-spiderman-meme dept

There have been a bunch of lawsuits over the last couple of years from traditional content industries suing AI providers, claiming copyright infringement. We’re still a long way from figuring out how all of these lawsuits will shake out. We’ve made it clear that we’re skeptical of these lawsuits, largely because you would have to basically ignore a bunch of important and useful copyright precedents to reach the conclusion that training on copyright-covered works infringes on copyright.

Of course, this is copyright, where logic and precedent are often ignored based on who a judge hates more. So, we shall see. But, we’ve definitely seen a lot of people cheering on these lawsuits, mainly in the false belief that it’s about “artists” vs. “big evil tech companies” and therefore the “artists” should win.

Reality is always a lot more complex and nuanced. If these lawsuits succeed, it will not help artists get paid. Instead, it will again increase the reliance on middlemen who have a long history of screwing over the artists. Just the fact that the RIAA is currently run by a guy who famously got his job at the RIAA just months after sneaking language into a bill to fuck over musicians should tell you all you need to know about the RIAA’s actual interests.

Also, if the cases decide that training is a licensable scenario, it will kill smaller and open source AIs and make it so only the largest of the largest tech companies can create LLMs. So instead of being a victory over “big tech,” it will hand the market to big tech.

And that’s not even getting into the damage it would do to the ability to read the open internet (which itself could be judged a licensable event) or the ability of researchers to scan and collect data about the open internet.

Just be careful what you wish for.

Earlier this week, the RIAA gleefully announced that it was suing two of the bigger music generator AI services. It filed one lawsuit against Suno in Massachusetts and another against Udio in New York.

Both lawsuits are effectively the same. And, they’re both ridiculously weak. They are both based on the premise that training on copyright-covered works requires permission. But, again, we’ve been there and done that. Training is a form of scanning or reading, and that’s either not a copyright-triggering event at all, or it’s fair use.

The lawsuits do not identify what copyright-covered content was actually copied beyond some handwaving about “all of it.” That is not sufficient for a copyright claim. The lawsuits argue that because you can tell these apps to make songs like those of musicians on RIAA member labels, that proves they’re infringing. From the Suno complaint:

Plaintiffs could have proceeded with this action based solely on eliciting that reasonable inference of copying. Nevertheless, Plaintiffs’ claims are based on much more. In particular, Plaintiffs tested Suno’s product and generated outputs using a series of prompts that pinpoint a particular sound recording by referencing specific subject matter, genre, artist, instruments, vocal style, and the like. Suno’s service repeatedly generated outputs that closely matched the targeted copyrighted sound recording, which means that Suno copied those copyrighted sound recordings to include in its training data. In addition, the public has observed (and Plaintiffs have confirmed) that even less targeted prompts can cause Suno’s product to generate outputs that resemble specific recording artists and specific copyrighted recordings. Such outputs are clear evidence that Suno trained its model on Plaintiffs’ copyrighted sound recordings.

Which… doesn’t matter? Again, training is clearly fair use, and “specific subject matter, genre, artist, instruments, vocal style, and the like” are not copyright-covered expression. None of those things are elements subject to copyright.

If you want proof of that, just look at what the RIAA itself said in court a few years ago, following the Blurred Lines decision that initially suggested music “styles” should be covered by copyright. The RIAA realized, quite quickly, that this might make a huge portion of the labels’ catalogues infringing, and it freaked out. In one case, the RIAA filed an amicus brief noting that such overprotection would be incredibly damaging:

… new songs incorporating new artistic expression influenced by unprotected, pre-existing thematic ideas must also be allowed.

That’s the RIAA’s own argument from just six years ago. And now they’re arguing that such unprotected thematic ideas are protected. But only when tech companies are making use of them, apparently.

Again, in that brief, the RIAA cogently argues against what the RIAA is now arguing in these complaints:

Most compositions share some elements with past compositions—sequences of three notes, motifs, standard rhythmic passages, arpeggios, chromatic scales, and the like. Likewise, all compositions share some elements of “selection and arrangement” defined in a broad sense. The universe of notes and scales is sharply limited. Nearly every time a composer chooses to include a sequence of a few notes, an arpeggio, or a chromatic scale in a composition, some other composer will have most likely “selected” the same elements at some level of generality.

To keep every work from infringing — and to keep authors from being able to claim ownership of otherwise unprotected elements — this Court has stressed that selection and arrangement is infringed only when there is virtual identity between two works, not loose resemblance. The same principle should be recognized for music.

Um. So, considering that the complaints do not show “virtual identity between two works,” the RIAA itself has made the case for why these models are not infringing.

In that same brief, the RIAA itself admits that there can be only “thin” copyright coverage on general themes at most, to avoid making music inspired by other works infringing:

To prevent nearly every new composition being at risk for liability, copyright claims based on “original contributions to ideas already in the public domain,” Satava v. Lowry, 323 F.3d 805 (9th Cir. 2003), are seen as involving a “thin copyright that protects against only virtually identical copying.” Id. at 812; see also Ets-Hokin v. Skyy Spirits, Inc., 323 F.3d 763, 766 (9th Cir. 2003) (“When we apply the limiting doctrines, subtracting the unoriginal elements, Ets-Hokin is left with . . . a ‘thin’ copyright, which protects against only virtually identical copying.”); Rentmeester v. Nike, Inc., 883 F.3d 1111, 1128-29 (9th Cir. 2018). This Court has long recognized this principle in claims involving visual art that allegedly creatively combines public domain elements, as with the sculptures in Satava or the photographs in Ets-Hokin and Rentmeester. The same should apply to music.

Perhaps Suno and Udio should take a page from the RIAA’s own legal arguments in responding to these complaints against them.

I am sure RIAA folks (and anti-AI folks) will rush in to explain why “this is different,” but it’s not. It’s literally the same argument. Does copyright actually protect genre, themes, and the like? Of course not. It would be a ridiculous and dangerous outcome should that come to pass.

Now, I know the RIAA will claim that it’s not suing over the output of these tools, but rather just pointing to those things as proof of infringement on the training side. But, again, training by scanning copyright-covered material for a totally transformative use (which includes learning from or being inspired by) is quintessential fair use.

The training is fair use. The fact that it can output songs with a similar theme matters not one bit to the copyright question, as the RIAA itself admits.

Of course, this case will go on for years and years. You can never predict how courts will rule on copyright issues, but these two cases seem particularly weak and silly. This is especially true given how it shows the RIAA going back on its own previous claims from just a few years ago.

And, just to close out this piece, I’ll note that RIAA CEO Mitch Glazier (again, the very guy who snuck words into a totally unrelated bill to literally take copyrights away from artists and hand them to music labels) is quoted in the press release about this lawsuit talking about how it’s not fair to “exploit” an “artist’s life’s work” for profit, even though that’s exactly what all of his member labels have done for nearly all of their existence.

Companies: riaa, suno, udio, universal music group


Comments on “RIAA Of Six Years Ago Debunks RIAA Of Today’s AI Lawsuit Claims”

70 Comments
Anonymous Coward says:

Not that complex

From a copyright point of view, a LLM is really just a large lossy archive format. So the RIAA, as evil as they are, is not that far off.

All the RIAA should have to show is that a prompt that is not basically the entire work causes a covered work, or something substantially similar, to be produced, and that the work in question was in the LLM’s training data.

Being parasites on the economy, the RIAA seems to think that just alleging that maybe they could show the above is enough.

Of course I doubt the courts will come to a reasonable conclusion because computers.

BJC (profile) says:

Re: Re: LLMs are arguably a form of data compression

LLMs are not an intentional lossy storage format, but people keep writing papers equating them with compression:
* “Language Modeling Is Compression” https://arxiv.org/abs/2309.10668
* “Compression Represents Intelligence Linearly” https://arxiv.org/abs/2404.09937 (“Recently, language modeling has been shown to be equivalent to compression”)
* And this paper using Gzip as an LLM: https://aclanthology.org/2023.findings-acl.426.pdf

So, if you’re going to argue that LLMs shouldn’t be equated to compression formats, the argument has to be that the legal analogy — mapping the facts to the law — is inapt, not that it’s factually wrong to liken a language model to a compression format.
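To make the equivalence in those papers concrete, here is a minimal, purely illustrative sketch (the token probabilities are invented, not from any real model): an entropy coder driven by a model's next-token probabilities needs about -log2 p(token) bits per token, so a model's cross-entropy on a text is, in effect, its compressed size.

```python
import math

# Toy next-token probabilities a hypothetical language model assigns to the
# tokens it actually observes (these numbers are invented for illustration).
observed = [("the", 0.20), ("quick", 0.05), ("brown", 0.10), ("fox", 0.15)]

# An arithmetic/entropy coder driven by the model needs ~ -log2(p) bits per token.
model_coded_bits = sum(-math.log2(p) for _, p in observed)

# Compare against storing the raw UTF-8 bytes of the same tokens.
raw_bits = sum(len(tok.encode("utf-8")) * 8 for tok, _ in observed)

print(f"model-coded: ~{model_coded_bits:.1f} bits vs raw: {raw_bits} bits")
# The better the model predicts the text, the fewer bits the coder needs,
# which is why "better language model" and "better compressor" line up.
```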

Rocky says:

Re: Re: Re:

You are equating two different things IMHO; a lossy compression format isn’t the same as a lossy storage format, and I have a hard time imagining what a lossy storage format actually is, except a litany of read errors.

I guess I find the terms used to be very sloppy in how they describe how an LLM processes and stores data since, in simple terms, it simply generates, encodes and then stores statistical data about content.

Nafnlaus says:

Re: Re: Re: Re:

This is a misunderstanding. Yes, they are “compression” to the degree that anything that holds any information is “compression”, including the human brain. What they aren’t is a compressed replica of originals.

When you “memorize”, say, what a house looks like, you’re only capturing a tiny fraction of the perceptible details of that house – but through your imagination you can fill in the gaps, even from the most vague of recollections.

When it comes to compression of data, some percentage of any raw data is imperceptible or barely perceptible to humans. Compressed file formats compress data by throwing this out. Maybe you throw out 5-to-1, 10-to-1, even more. But what happens when you throw out more? Now you’re not just throwing out imperceptible details, but perceptible ones, and with enough compression, you’re throwing out basically the entire structure of the work.

22 million new songs per year are uploaded to Spotify. The raw data size of the music represented by producers is surely at least 1e16 bytes (10 petabytes), and probably about an order of magnitude higher than that (1e17 bytes / 100 petabytes). The model weights on a service like this, by contrast, might be more like 10GB. That’s about 7 orders of magnitude of difference. For every 10MB of music, the models contain 1 byte. In case the point isn’t clear: the models are not storing the specific details of songs. They’re just capturing generalities about the essence of music (rhythm, pacing, tone, transitions, sounds of instruments, etc). All of the actual details are “imagined in”, because said details were thrown away during the “compression”.
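A quick back-of-the-envelope check of that ratio, using the rough estimates above (these are assumptions, not measured figures):

```python
# All figures are the rough estimates from the comment above, not measurements.
catalog_bytes = 1e17      # ~100 PB of raw source audio across the catalog
model_bytes = 10e9        # ~10 GB of model weights

ratio = catalog_bytes / model_bytes
print(f"~{ratio:.0e} bytes of audio per byte of weights")  # ~1e+07, i.e. ~10 MB per byte
```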

Both biological brains and neural networks operate in latent spaces. People talk about latents as a “computery” concept, a vector of floats, but latents are just a freezeframe of a deep layer of activations that have been pinched down from a much higher resolution input field. In biological brains, a latent space can be represented as the firing rates of a cluster of neurons that acts as a bottleneck. Neurons leading up to a latent (biological or ANN) can be classified as a compressor, and neurons leading away from a latent can be classified as a decompressor. The decompressor “imagines in” the details that don’t exist in the latent because they were thrown away in the compression process.

You can provide any latent to the decompressor, even entirely random values, wherein the decompressor will imagine in a random scene. The key is that latent spaces are always coherent, because the compression process has thrown out that which is incoherent. Latent spaces are interesting because you can interpolate linearly between any two concepts (with coherent answers along the entire interpolation), and do mathematical operations on concepts themselves (for example, “king – man + woman ~= queen”).
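A tiny, purely hypothetical numpy sketch of those two properties (the 3-D “embeddings” here are invented; real latents have hundreds or thousands of dimensions):

```python
import numpy as np

# Made-up 3-D "embeddings", purely for illustration.
king = np.array([0.9, 0.8, 0.1])
man = np.array([0.1, 0.8, 0.1])
woman = np.array([0.1, 0.1, 0.9])
queen = np.array([0.9, 0.1, 0.9])

# 1) Concept arithmetic: king - man + woman lands on (or near) queen.
print(np.allclose(king - man + woman, queen))  # True for these toy vectors

# 2) Linear interpolation between two latents stays inside the space, so a
#    decoder would produce a coherent blend at every step along the way.
for t in np.linspace(0.0, 1.0, 5):
    print(t, (1 - t) * king + t * queen)
```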

But to reiterate: the models are NOT storing music. They’ve “compressed” the music dataspace down to general concepts about music.**

** One does have to add a caveat about overtraining. If a model keeps encountering the same thing over and over, it’s going to store increasingly more information about that thing (at the cost of storing fewer generalities about everything else). This can be desirable – for example, if an image model keeps encountering the American flag, you do want it to learn the precise shape of the American flag. But apart from common motifs that should be learned, you don’t want the model wasting its limited weights on memorizing specific aspects of specific works. So proper deduplication is important.

Anonymous Coward says:

Perhaps Suno and Udio should take a page from the RIAA’s own legal arguments in responding to these complaints against them.

Perhaps one better: Rather than simply re-issuing the RIAA’s arguments, perhaps Suno and Udio should ask for Judicial Notice of the RIAA’s previous positions, especially if they made such arguments in court (rather than to empty air/The Media/The Internet).

Crafty Coyote says:

For those of a certain age who can get the reference, the copyright industry is to the “progress of arts and sciences” what Alfred Bester and the Psi Corps were to the poor telepaths who had no choice but to trust in them.

If there were rogue artists who wanted to get away from copyright, you can bet that the legal goons the RIAA employs would come after them and have them sent to jail.

This comment has been flagged by the community.

terop (profile) says:

establishing the fact that AI processes violate copyright is very simple if you know a concept called “Effort calculation”. Cheap products shouldnt be created with techniques that are very expensive to reproduce. Reproducibility is one of the most important features of all technology development. When cars replaced horses, it meant that humans got a requirement to keep car factories and mining for metals up and running for next 2 million years. Same applies to your copyrighted works. If your book depends on wikipedia’s expensive-to-recreate library of user written texts, it is less valuable than a work that builds the same material from scratch. The deep dependencies cannot be maintained endlessly.

Wikipedia and user-generated content is our times examples of what copyright laws were designed to prevent. the fact that some cutting edge technology cannot be reproduced in the future even though millions of lives depend on its existence is failures that our generation will need to solve eventually.

MrWilson (profile) says:

Re:

establishing the fact that AI processes violate copyright is very simple if you know a concept [that isn’t a part of copyright law].

Establishing the fact that your perspective isn’t based on copyright law is very simple if you understand copyright law.

Cheap products shouldnt be created with techniques that are very expensive to reproduce.

Sure, nobody is allowed to record beautiful music on expensive musical equipment because this rando on the internet thought project management concepts should be shoehorned into copyright law.

Wikipedia and user-generated content is our times examples of what copyright laws were designed to prevent.

No, not at all.

the fact that some cutting edge technology cannot be reproduced in the future even though millions of lives depend on its existence is failures that our generation will need to solve eventually.

What the hell is this a reference to?

This comment has been flagged by the community.

MrWilson (profile) says:

Re: Re: Re:

Right here:

a concept [that isn’t a part of copyright law].

this rando on the internet thought project management concepts should be shoehorned into copyright law.

Effort calculation isn’t an aspect of copyright law. You’re making a (stupid) moral argument based on your own arbitrary preference and pretending it’s already written into the law.

You would have been more accurate if you had said, “establishing the fact that AI processes violate copyright is very simple…if you just make up fake copyright law, don’t care about facts, and believe your own bullshit.”

This comment has been flagged by the community.

terop (profile) says:

Re: Re:

this rando on the internet thought project management concepts should be shoehorned into copyright law.

I think this shows the craziness of the pirates and copyright minimalists. They think that copyright law stands alone in the world and something like project management concepts are incompatible with the law. This kind of thinking is completely bogus, the law needs to work with all (legal) activity that you can do to a copyrighted work, including project management, specification, programming, testing, releasing, publishing, error correction, bugfixing, firefighting, copying (by copyright owner), customer response creation, money collection, and tons of other activities like book burning…

trying to frame copyright law in such light that its incompatible with these activities is very odd position.

MrWilson (profile) says:

Re: Re: Re:

They think that copyright law stands alone in the world and something like project management concepts are incompatible with the law.

trying to frame copyright law in such light that its incompatible with these activities is very odd position.

This is a straw man that shows you not only don’t understand copyright law, but you also didn’t understand what I wrote.

It’s not about compatibility or incompatibility. It’s about not being a part of the actual law. You made up an arbitrary standard that is not a part of the actual written law to argue that conduct is against the law. You don’t get to just make up new aspects of law just because you feel they should be included.

It would be fine if you had said that it should be part of the law, so long as you acknowledged that it isn’t currently part of the law. But actually claiming that “establishing the fact that AI processes violate copyright is very simple if you know a concept called ‘Effort calculation’” is just untrue bullshit. It doesn’t violate copyright law, and effort calculation is not a part of the law!

the law needs to work with all (legal) activity that you can do to a copyrighted work

Copyright law and related case law already covers these scenarios. Again, you’re demonstrating you don’t understand the law.

terop (profile) says:

Re: Re:

why do you hate learning?

It just fills your head with useless bullshit.
Enough is enough. Any activity that you practice too often is not useful any longer, regardless of how big an advantage it has over other activities at the beginning.

Is it because you hate that fact that no one has monopolized it yet?

Shaking the status quo is the key.

Anonymous Coward says:

Re: Re: Re:

Shaking the status quo is the key.

And considering that Meshpage continues to fail to break into the market of web and game design, it’s proof that you’ve done a terrible job of it.

Neither is the government of Finland publicly executing or organ harvesting citizens based on accusations of copyright infringement.

The truth is that you’ve peaked by making a handful of games for consoles that aren’t even dominant in the market right now, and you decided that the best option was to take grievous personal offense at the fact that your name hasn’t been made into a state religion yet. Tough. Not even copyright law is equipped to help you out there.

Anonymous Coward says:

Re: Re: Re:3

If you can’t get a trailer done on time, a trailer that you’ve put so much weight and emphasis on as key to your imaginary success… that’s really no skin off my nose.

It’s not anyone’s moral obligation to excuse you for your failures, or compensate you because you think you’re not successful enough.

Not to mention you already have a trailer, and its quality makes even the most basic engine proof from video game programming university look like a triple A product.

Anonymous Coward says:

Re: Re: Re:

I’d ssuggest if you have such difficulty understanding metaphor, you should avoid online discussion in english. American speakers use metaphor extensively alongside idioms, and arguements will rapidly become nonsense for you.

Having multiple corporations with massive budgets fighting it out might stop them all. It might not. But the battlefield they play in impacts us. We are left with the legal precidents and impacted services and economic volitility. Its a pretty simple metaphor, frankly.

Anonymous Coward says:

Re: Re: Re:2

I’d ssuggest if you have such difficulty understanding metaphor, you should avoid online discussion in english.

If you’re going to be condescending, don’t be guilty of simple mistakes. If you’re going to criticize someone’s linguistic skills, don’t misspell “ssuggest” or fail to capitalize a proper noun like “english” or misspell “precident.”

I am an American speaker who uses metaphors a lot and I was criticizing your bad use of a metaphor. The idea of allowing two negative forces to fight each other can be valid and collateral damage may be unavoidable (and not even up to you). And the alternative of trying to fight both forces is likely more difficult.

terop (profile) says:

Re: Re: Re:5

the wealthy and the corporations are the true pirates.

no. The corporation stopped using my publish-bit when they stopped paying me.

They take from the actual creators and exploit and violate copyrights with near-perfect immunity.

This is called “customers” in ordinary business.

Copyright law as it exists steals from artists and the public.

No, it just limits how far out into the world your work can spread. You wouldn’t let 15-year-old school children reconfigure the nuclear power plant next door, even if those teenagers had powerful ideas about how to shut down the facility for green hippy ideals — let the customers (of electricity) suffer.

Rich Kulawiec says:

I'm not sure that existing IP thinking will handle this

By “handle”, I mean “provide us with a framework to think about this issue and perhaps legally codify it”. Here’s what I mean, and let’s just stick to music for the moment.

Many musical works quote other works (or, these days, sample them) and for the most part we seem to recognize that these are ways of creating a new work with a snippet of an old one. The quoted part isn’t the main body of the work — it’s just a few notes or a few seconds, and while editing it out would certainly change the new work, this would leave it largely intact and able to stand on its own as a new composition.

The new works being generated by AI (after having ingested a corpus of existing music) aren’t like this: they’re musical autocomplete on steroids. They’re algorithms, not composers, and they’re capable of churning out an endless parade of songs AND doing it quickly…something a human couldn’t do.

So what happens when someone trains an AI with the entire recorded catalog of, let’s say, the Foo Fighters, and then tells it to write 100,000,000 Foo Fighter-esque songs, and then copyrights every single one of them? By “what happens” I mean “what do the actual real live Foo Fighters do?” because given a library of 100M FF-esque songs, there’s a substantial probability that the next one they write will heavily overlap with one of those, and then they’ll be infringing, and then bad things happen.

The people running AI companies of course don’t care: apparently they think all jobs but their own are expendable. (I’m looking right at OpenAI’s Mira Murati, “Some creative jobs maybe will go away, but maybe they shouldn’t have been there in the first place.”)

I don’t know what to do about this, but I think that our current ways of thinking and our current legal mechanisms aren’t ready for it. And, unfortunately, the sociopaths running AI companies don’t care how much damage they do and how many people they hurt, so we would be foolish to rely on their self-restraint — they don’t have any. We need to find a new way to think about this, and having thought about it, we need to find a new way to codify this.

Anonymous Coward says:

Re:

I’m not sure that existing IP thinking will handle this. By “handle”, I mean “provide us with a framework to think about this issue and perhaps legally codify it”.

It’s not going to, because this was never the goal. So long as the RIAA can funnel money from anywhere into the pockets of the middlemen, i.e. them, that’s all they actually care about. It doesn’t matter whether the money goes to the artist or the record label or the Martin Shkreli who happens to be holding onto the ownership rights. Because the RIAA is getting a cut.

I’m looking right at OpenAI’s Mira Murati, “Some creative jobs maybe will go away, but maybe they shouldn’t have been there in the first place.”

I’d absolutely agree that Murati sounds completely tone-deaf or intentionally psychopathic here. On the other hand, there are some jobs involved with the creative industry that shouldn’t have been there in the first place. Namely, the bulk of corporate lawyers poring painstakingly over chord progressions trying to find a case worth suing over.

Consider that Ed Sheeran might have to actually start recording his songwriting and music production sessions, just so he doesn’t get hit with another Marvin Gaye estate lawsuit. Consider that this might have to be normalized for every musical creator going forward. That’s going to involve a significant amount of labor and resources. Sure, it technically creates jobs. Sure, a lot of money was made by the legal system. But was it worth it? Is this behavior worth encouraging? Is this economic activity actually valuable or constructive?

Anonymous Coward says:

Re: Re:

Not really. Under the rules of your stricter copyright law, you’d have to pay that $10k up front to defend yourself. Also Meshpage would have to be taken offline until you’ve been proven innocent of infringement.

But we all know you won’t do it because you’re a hypocritical grifter.

terop (profile) says:

Re: Re: Re:

Also Meshpage would have to be taken offline until you’ve been proven innocent of infringement.

That wouldn’t be appropriate response to your invalid claims of superiority. The god complex that you possess gives you access to the mental institute front door, but not much anything more than that. Then a robot sent from the future to kill you will actually save your ass before the world explodes in a huge bang.

terop (profile) says:

Re: Re: Re:3

The response was in accordance with your rules of stricter copyright law

You don’t have viable alternative to this stricter copyright law stuff. (hint: viable keyword is kinda powerful and you’re unable to overcome its requirements)

in that all potential infringement must be preemptively prevented.

This is true statement. Leaving possibility for infringement is a failure of the process. Especially the copyright owner’s exclusive bits must be properly protected against pirate groups activities.

Happily it’s not as impossible to do as some pirates proclaim. If your tests indicate that encoding pirated material into the software input slots is possible, limiting the size of the input is a perfectly good approach to cut the more egregious infringements away from consideration. This is what AI tools could be doing when they get sued by the entertainment industry: if they just limit the amount of input to reasonably short snippets, then would-be copyright owners could identify as owning only short snippets of material, and the full AI database would be free from infringement.

This comment has been flagged by the community.

terop (profile) says:

Re: Re:

Who knew every would-be musician can never learn to play anyone else’s music because that’s “piracy!”

The exclusive author’s bits are DISPLAY, PERFORM, DISTRIBUTE.

guess what you wanted is a permission to do the perform -operation. And if you pay a little more, the display bit gives you permission to advertise the song with leaflets.

Anonymous Coward says:

These AI song generators can create almost complete copies of songs like All I Want For Christmas

And pray tell, who’s losing profits as a result of someone making AI generated Christmas carols… in June?

Funny how nobody screaming “piracy” can start pointing out damages besides imaginary numbers stacked together to demand settlements out of blind grandparents.
