<
div class=”field field–name-body field–type-text-with-summary field–label-hidden”>
<
div class=”field__items”>
<
div class=”field__item even”>
California legislators have begun debating a bill (A.B. 412) that would require AI developers to track and disclose every registered copyrighted work used in AI training. At first glance, this might sound like a reasonable step toward transparency. But it’s an impossible standard that could crush small AI startups and developers while giving big tech firms even more power.
A Burden That Small Developers Can’t Bear
The AI landscape is in danger of being dominated by large companies with deep pockets. These big names are in the news almost daily. But they’re far from the only ones – there are dozens of AI companies with fewer than 10 employees trying to build something new in a particular niche.
This bill demands that creators of any AI model–even a two-person company or a hobbyist tinkering with a small software build– identify copyrighted materials used in training. That requirement will be incredibly onerous, even if limited just to works registered with the U.S. Copyright Office. The registration system is a cumbersome beast at best–neither machine-readable nor accessible, it’s more like a card catalog than a database–that doesn’t offer information sufficient to identify all authors of a work, much less help developers to reliably match works in a training set to works in the system.
Even for major tech companies, meeting these new obligations would be a daunting task. For a small startup, throwing on such an impossible requirement could be a death sentence. If A.B. 412 becomes law, these smaller players will be forced to devote scarce resources to an unworkable compliance regime instead of focusing on development and innovation. The risk of lawsuits—potentially from copyright trolls—would discourage new startups from even attempting to enter the field.
A.I. Training Is Like Reading And It’s Very Likely Fair Use
A.B. 412 starts from a premise that’s both untrue and harmful to the public interest: that reading, scraping or searching of open web content sho
[…]
Content was cut in order to protect the source.Please visit the source for the rest of the article.
Read the original article: