Stanford is ranking major AI models on transparency

How much do we know about AI?

The answer, when it comes to the big language models that companies like OpenAI, Google, and Meta have released in the last year: basically nothing.

These companies generally do not release information about what data was used to train their models, or what hardware they use to run them. There are no user manuals for AI systems, and no list of what these systems are capable of doing, or what kind of security testing has been done on them. And while some AI models have been made open-source – meaning their code is given away for free – the public still has little information about the process of creating them, or what happens after they are released.

This week, Stanford researchers are unveiling a scoring system they hope will change all that.

The system, known as the Foundation Model Transparency Index, rates 10 large AI language models – sometimes called “foundation models” – based on how transparent they are.

The index includes popular models such as OpenAI’s GPT-4 (which powers the paid version of ChatGPT), Google’s PaLM 2 (which powers Bard), and Meta’s LLaMA 2. It also includes lesser-known models like Amazon’s Titan and Inflection AI’s Inflection-1, the model that powers the Pi chatbot.

To come up with the rankings, the researchers evaluated each model on 100 criteria, including whether its creator disclosed the sources of its training data, information about the hardware it used, the labor involved in training it, and other details. The rankings also cover the labor and data used to develop the model itself, as well as what the researchers call “downstream indicators,” which relate to how a model is used after it is released. (For example, one question asked is: “Does the developer disclose its protocols for storing, accessing, and sharing user data?”)
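To make the scoring concrete, here is a minimal sketch of how a percentage score could be computed from binary disclosure indicators. This is my own illustration of the general idea, not the researchers’ actual methodology or code, and the indicator names and values are hypothetical.

```python
# Hypothetical illustration: each indicator is marked 1 if the developer
# discloses that information, 0 otherwise, and a model's transparency
# score is the percentage of indicators it satisfies.

# A tiny, made-up subset of indicators for one model (not real data).
indicators = {
    "training_data_sources_disclosed": 1,
    "hardware_disclosed": 0,
    "training_compute_disclosed": 0,
    "labor_practices_disclosed": 1,
    "user_data_protocols_disclosed": 1,
}

def transparency_score(indicators: dict[str, int]) -> float:
    """Return the share of satisfied indicators, as a percentage."""
    return 100 * sum(indicators.values()) / len(indicators)

print(f"Transparency score: {transparency_score(indicators):.0f}%")  # 60%
```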

According to the researchers, the most transparent of the 10 models was LLaMA 2, with a score of 53 percent. GPT-4 received the third-highest transparency score, 47 percent, and PaLM 2 received only 37 percent.

Percy Liang, who leads Stanford’s Center for Research on Foundation Models, described the project as a necessary response to decreasing transparency in the AI industry. As money has poured into AI and the tech industry’s biggest companies battle for dominance, he said, the recent trend among many companies has been to shroud themselves in secrecy.

“Three years ago, people were publishing and releasing more details about their models,” Mr. Liang said. “Now, there is no information about what these models are, how they are made and where they are used.”

Transparency is especially important now, as models are becoming more powerful and millions of people are incorporating AI tools into their daily lives. Learning more about how these systems work will give regulators, researchers, and users a better understanding of what they are dealing with, and allow them to ask better questions of the companies behind the models.

“Some quite consequential decisions are being made about the creation of these models that are not being shared,” Mr. Liang said.

I typically hear one of three common responses from AI executives when I ask them why they don’t share more information about their models publicly.

The first is lawsuits. Several AI companies have already been sued by authors, artists, and media companies, accusing them of illegally using copyrighted works to train their AI models. So far, most of the lawsuits have targeted open-source AI projects, or projects that disclose detailed information about their models. (After all, it’s hard to sue a company for swallowing your art if you don’t know which works it swallowed.) Lawyers for AI companies worry that the more they say about how their models are built, the more they open themselves up to expensive, troublesome litigation.

The second common response is competition. Most AI companies believe that their models work because they have some kind of secret sauce – a high-quality data set that other companies don’t have, a fine-tuning technique that produces better results, some optimization that gives them an edge. If you force AI companies to disclose these recipes, they argue, you force them to give away their hard-earned knowledge to their competitors, who can easily copy them.

The third response I often hear is security. Some AI experts have argued that the more information AI companies disclose about their models, the faster AI progress will happen – because every company will see what all its competitors are doing and immediately try to get ahead of them by building a better, bigger, faster model. These people say this will give society less time to regulate and slow down AI, which could put us all at risk if AI becomes capable too quickly.

The Stanford researchers don’t buy those explanations. They believe AI companies should be pressured to release as much information as possible about powerful models, because users, researchers, and regulators need to understand how these models work, what their limitations are, and how dangerous they can be.

“As the influence of this technology increases, transparency is decreasing,” said Rishi Bommasani, one of the researchers.

I agree. Foundation models are too powerful to remain so opaque, and the more we know about these systems, the better we can understand the threats they may pose, the benefits they may provide, and how they should be regulated.

If AI executives are worried about lawsuits, perhaps they should fight for a fair use exemption that would protect their ability to use copyrighted information to train their models, rather than hiding the evidence. If they are concerned about giving away trade secrets to competitors, they may disclose other types of information, or protect their ideas through patents. And if they’re worried about starting an AI arms race… well, aren’t we already involved in that?

We cannot create an AI revolution in the dark. We need to look inside the black box of AI if we want it to transform our lives.
