News publishers have argued over the past year that AI chatbots like ChatGPT rely on copyrighted articles to power the technology. Publishers now say that the developers of these tools use news content disproportionately.
The News Media Alliance, a trade group that represents more than 2,200 publishers, including The New York Times, released research on Tuesday that it said showed that developers favor articles over general online content to train the technology, and that chatbots reproduce sections of some articles in their responses.
The group argued that the findings show that AI companies violate copyright law.
“This is an extension of an existing problem,” said Danielle Coffey, president and chief executive of the News Media Alliance, which has been arguing for years that tech companies like Google don’t compensate news organizations fairly for displaying their work on online services.
Representatives from Google and OpenAI, the maker of ChatGPT, did not immediately respond to requests for comment.
Generative artificial intelligence, the technology behind chatbots, went mainstream late last year with the release of ChatGPT, a chatbot that can answer questions or complete tasks using information obtained from the internet and elsewhere. Other tech companies have since released their own versions.
It is impossible to know exactly what data is fed into large language models because many developers have not publicly confirmed what they have used. In its analysis, the News Media Alliance compared the public data sets used to train the best-known large language models, which underpin AI chatbots like ChatGPT, with an open-source data set of general content scraped from the web.
The group found that the curated data sets used news content five to 100 times more than the general data set did. Ms. Coffey said those results showed that people building AI models value quality content.
The report also found examples of models directly reproducing language used in news articles, which Ms. Coffey said meant copies of publishers’ content were kept for use by chatbots. She said the output of chatbots competes with news articles.
“It really acts as a replacement for our work,” Ms. Coffey said. “You see our articles are just taken and repeated verbatim.”
The News Media Alliance has submitted the report’s findings to the US Copyright Office for its study of AI and copyright law.
“It shows that we will have a very good case in court,” Ms. Coffey said.
Ms. Coffey said the News Media Alliance is actively exploring collective licensing of content from its members, which include some of the nation’s largest news and magazine publishers.
Media executives have raised several concerns about AI, including the use of articles to train language models. Some executives fear that if chatbots become the primary search tool, search engines could reduce traffic to news sites. Many people working in media are also concerned that they could be replaced by AI.