The race to stop the ‘worst-case scenario for machine learning’

Dave Willner has had a front-row seat to the evolution of the worst things on the internet.

He started working at Facebook in 2008, when social media companies were still making up their rules as they went. As the company’s head of content policy, it was Mr. Willner who, more than a decade ago, wrote Facebook’s first official community standards: an informal one-page list that, he said, mostly boiled down to a ban on “Hitler and naked people.” Those standards have since grown into a sprawling catalog of the abuses, crimes and other grotesqueries that are banned across all of Meta’s platforms.

So last year, when the San Francisco artificial intelligence lab OpenAI was preparing to launch Dall-E, a tool that lets anyone instantly create an image by describing it in a few words, the company tapped Mr. Willner to be its head of trust and safety. Initially, that meant sifting through all of the images and prompts that Dall-E’s filters flagged as potential violations, and figuring out ways to stop would-be violators from succeeding.

Mr. Willner had not been in the job long before he found himself considering a familiar threat.

Just as child predators had used Facebook and other major tech platforms to disseminate images of child sexual abuse for years, they were now attempting to use Dall-E to create entirely new images. “I’m not surprised that this was something that people would try to do,” Mr. Willner said. “But to be very clear, neither were the folks at OpenAI.”

For all the recent talk about the hypothetical existential risks of generative AI, experts say it is this immediate threat, child predators already using new AI tools, that deserves the industry’s undivided attention.

In a newly published paper, researchers at the Stanford Internet Observatory and Thorn, a nonprofit that fights the spread of child sexual abuse online, found that since last August there has been a small but meaningful uptick in the amount of photorealistic, AI-generated child sexual abuse material circulating on the dark web.

According to Thorn’s researchers, this material has for the most part taken the form of imagery that uses the likeness of real victims but depicts them in new poses, subjected to new and increasingly egregious forms of sexual violence. The researchers found that the majority of these images were generated not by Dall-E but by open-source tools that were developed and released with few safeguards in place.

In their paper, the researchers report that less than 1 percent of child sexual abuse material found in a sample of known predatory communities appeared to be photorealistic AI-generated images. But given the rapid pace of development of these generative AI tools, researchers anticipate that number will only grow.

“Within a year, we’re going to be reaching a very problematic state of affairs in this area,” said David Thiel, chief technologist at the Stanford Internet Observatory, who co-wrote the paper with Thorn’s director of data science, Dr. Rebecca Portnoff, and Thorn’s head of research, Melissa Stroebel. “It’s the worst case scenario for machine learning that I can think of.”

Dr. Portnoff has been working on machine learning and child safety for over a decade.

To her, the idea that a company like OpenAI is already thinking about the issue points to the fact that this field is on a faster learning curve than the social media giants were in their early days.

“The posture is different today,” Dr. Portnoff said.

Still, she said, “If I could turn back the clock, it would have been a year ago.”

In 2003, Congress passed a law banning “computer-generated child pornography”—a rare example of congressional future-proofing. But at the time, creating such images was both prohibitively expensive and technically complex.

The cost and complexity of creating these images had been steadily declining, but the landscape shifted last August with the public debut of Stable Diffusion, a free, open-source text-to-image generator developed by Stability AI, a machine learning company based in London.

In its earliest iteration, Stable Diffusion placed few limits on the kinds of images its model could produce, including ones containing nudity. “We trust people, and we trust the community,” the company’s chief executive, Emad Mostaque, told The New York Times last year.

In a statement, Motez Bishara, communications director at Stability AI, said the company prohibited the misuse of its technology for “illegal or unethical” purposes, including the creation of child sexual abuse material. “We strongly support law enforcement efforts against those who misuse our products for illegal or nefarious purposes,” Mr. Bishara said.

Because the model is open-source, developers can download and modify the code on their own computers and use it to generate, among other things, realistic adult pornography. In their paper, the researchers at Thorn and the Stanford Internet Observatory found that predators had modified those models so that they were capable of producing sexually explicit images of children, too. The researchers describe one case in which an AI-generated image of a woman was altered until it resembled an image of Audrey Hepburn as a child.

Stability AI has since released filters that try to block what the company calls “unsafe and inappropriate content.” And newer versions of the technology were built using data sets that exclude content deemed “not safe for work.” But, according to Mr. Thiel, people are still using the older model to produce imagery that the newer one prohibits.

Unlike Stable Diffusion, Dall-E is not open-source and can be accessed only through OpenAI’s own interface. The model was also developed with many more safeguards in place, prohibiting even the creation of legal nude imagery of adults. “The models themselves have a tendency to refuse to have sexual conversations with you,” Mr. Willner said. “We do that mostly out of prudence around some of these darker sexual topics.”

The company also implemented guardrails early on to keep people from using certain words or phrases in their Dall-E prompts. But Mr. Willner said predators still try to game the system by using what researchers call “visual synonyms”: creative terms for evading the guardrails while describing the images they want to produce.

“If you remove the model’s knowledge of what blood looks like, it still knows what water looks like, and it knows what the color red is,” Mr. Willner said. “That problem also exists for sexual content.”

Thorn runs a tool called Safer, which scans images for child sexual abuse and helps companies report them to the National Center for Missing and Exploited Children, the federally designated clearinghouse for suspected child sexual abuse material. OpenAI uses Safer to scan content that people upload to Dall-E’s editing tool. That is useful for catching real images of children, but Mr. Willner said even the most sophisticated automated tools could struggle to accurately identify AI-generated imagery.

It is an emerging concern among child safety experts: that AI will be used to create not only new images of real children, but also explicit images of children who do not exist.

That content is illegal on its own and will need to be reported. But the prospect has also raised concerns that the federal clearinghouse could be flooded with fake images, complicating efforts to identify real victims. Last year alone, the center’s CyberTipline received nearly 32 million reports.

“Will we even know if we start receiving those reports? Will they be tagged or differentiated from images of real children?” said Yiota Souras, general counsel for the National Center for Missing and Exploited Children.

At least some of those answers will need to come not just from AI companies like OpenAI and Stability AI, but also from the companies that run messaging apps and social media platforms, such as Meta, the CyberTipline’s top reporter.

Last year, more than 27 million tips came from Facebook, WhatsApp and Instagram alone. Already, tech companies use a classification system, developed by an industry alliance called the Tech Coalition, to categorize suspected child sexual abuse material by the apparent age of the victim and the nature of the acts depicted. In their paper, the Thorn and Stanford researchers argue that these classifications should be broadened to also reflect whether an image was computer-generated.

In a statement to The New York Times, Antigone Davis, Meta’s global head of safety, said, “We are working to be purposeful and evidence-based in our approach to AI-generated content, like understanding when the inclusion of identifying information would be most beneficial and how that information should be conveyed.” Ms. Davis said the company would be working with the National Center for Missing and Exploited Children to determine the best way forward.

Beyond the platforms’ responsibilities, the researchers argue, there is more that AI companies themselves can do. Specifically, they could train their models not to generate images of child nudity, and they could clearly identify images as generated by artificial intelligence as those images make their way across the internet. That would mean baking a watermark into images that is more difficult to remove than the ones either Stability AI or OpenAI have already implemented.
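Why watermark removability matters can be made concrete with a toy sketch. The code below is purely illustrative and hypothetical, not any company’s actual scheme: it embeds a naive least-significant-bit (LSB) watermark, which reads back perfectly from an untouched copy but is wiped out by an edit as trivial as brightening every pixel by one.

```python
# Illustrative sketch only: why naive watermarks are easy to remove.
# Pixels are modeled as 0-255 grayscale integers; function names are hypothetical.

def embed_lsb(pixels, bits):
    """Set each pixel's lowest bit to one watermark bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract_lsb(pixels):
    """Read the watermark back out of the lowest bits."""
    return [p & 1 for p in pixels]

watermark = [1, 0, 1, 1, 0, 0, 1, 0]
image = [52, 87, 140, 201, 33, 94, 250, 18]

marked = embed_lsb(image, watermark)
print(extract_lsb(marked) == watermark)         # True: survives an exact copy

brightened = [min(p + 1, 255) for p in marked]  # a trivial, invisible edit...
print(extract_lsb(brightened) == watermark)     # False: ...destroys the mark
```

A watermark robust enough for the uses described above would have to survive cropping, recompression and small pixel perturbations, which is a much harder design problem than this fragile scheme.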

As lawmakers turn their attention to regulating AI, experts view mandating some kind of watermarking or provenance tracing as key to fighting not only child sexual abuse material but also misinformation.

“You’re only as good as the lowest common denominator here, which is why you want a regulatory regime,” said Hany Farid, professor of digital forensics at the University of California, Berkeley.

Professor Farid is responsible for developing PhotoDNA, a tool launched by Microsoft in 2009 that many tech companies now use to automatically find and block known child sexual abuse imagery. He said tech giants were too slow to adopt the technology after it was developed, allowing the scourge of child sexual abuse material to fester openly for years. He is currently working with a number of tech companies to create a new technical standard for tracing AI-generated imagery. Stability AI is among the companies planning to implement the standard.
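Tools like PhotoDNA match images by perceptual fingerprints rather than exact bytes, so a recompressed or lightly edited copy of a known image still matches. The sketch below is a heavily simplified stand-in for the idea (a toy “average hash” over tiny grayscale grids; the real PhotoDNA algorithm is proprietary and far more robust):

```python
# Illustrative sketch only: a toy perceptual fingerprint in the spirit of,
# but far simpler than, hash-matching systems like PhotoDNA.

def average_hash(pixels):
    """Fingerprint a grayscale grid: one bit per pixel, 1 if above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def hamming_distance(h1, h2):
    """Count differing bits; a small distance suggests a near-duplicate image."""
    return sum(a != b for a, b in zip(h1, h2))

img_a = [[10, 200], [220, 30]]   # a tiny "known" image
img_b = [[12, 205], [225, 33]]   # the same image, slightly brightened
img_c = [[200, 10], [30, 220]]   # an unrelated layout

print(hamming_distance(average_hash(img_a), average_hash(img_b)))  # 0: match
print(hamming_distance(average_hash(img_a), average_hash(img_c)))  # 4: no match
```

The hard case the article describes is exactly what this style of matching cannot solve: a wholly new AI-generated image has no known fingerprint to match against, which is why provenance standards are being pursued alongside hashing.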

Another open question is how the court system will deal with cases brought against creators of AI-generated child sexual abuse material – and what liability AI companies will face. Although legislation against “computer-generated child pornography” has been on the books for two decades, it has never been tested in court. An earlier law that tried to ban what was then called virtual child pornography was struck down by the Supreme Court in 2002 for infringing on speech.

Members of the European Commission, the White House and the US Senate Judiciary Committee have been briefed on Stanford and Thorn’s findings. It is important, Mr. Thiel said, that companies and lawmakers find answers to these questions before the technology advances to include things like full-motion video. “We have to get ahead of it,” Mr. Thiel said.

Thorn’s chief executive, Julie Cordua, said the researchers’ findings should be seen as a warning, and an opportunity. Unlike the social media giants who woke up years too late to the ways their platforms were enabling child predators, Ms. Cordua argues, there is still time to prevent the problem of AI-generated child abuse material from spiraling out of control.

“We know what these companies should be doing,” Ms. Cordua said. “We just need them to do it.”
