Artificial intelligence developers heavily rely on illegally scraping copyrighted material from news publications and journalists to train their models, a news industry group has claimed.
On Oct. 30, the News Media Alliance (NMA) published a 77-page white paper and an accompanying submission to the United States Copyright Office claiming that the data sets used to train AI models draw significantly more heavily on news publisher content than on other sources.
As a result, the group says, generative AI models “copy and use publisher content in their outputs,” which infringes on publishers' copyright and puts news outlets in competition with AI models.
“Many generative AI developers have chosen to scrape publisher content without permission and use it for model training and in real-time to create competing products,” NMA stressed in an Oct. 31 statement.
On Monday, the News/Media Alliance published a White Paper and a technical analysis and submitted comments to the @CopyrightOffice on the use of publisher content to power generative artificial intelligence technologies (#GAI). https://t.co/Zr05e7nZTS
The group argues that while news publishers make the investments and take on the risks, AI developers are the ones rewarded “in terms of users, data, brand creation, and advertising dollars.”
Reduced revenues, fewer employment opportunities and tarnished relationships with their viewers are other setbacks publishers face, the NMA noted in its submission to the Copyright Office.
To combat the issues, the NMA recommended the Copyright Office declare that using a publication’s content to monetize AI systems harms publishers. The group also called for various licensing models and transparency measures to restrict the ingestion of copyrighted materials.
The NMA also recommends the Copyright Office
Read more on cointelegraph.com