The chatbot that millions of people use to write term papers, computer code and fairy tales doesn’t just write words. ChatGPT, OpenAI’s artificial-intelligence-powered tool, can also analyze images — describing what’s in them, answering questions about them and even recognizing the faces of specific people. The hope is that, eventually, someone could upload a picture of a broken car engine or a mysterious rash and ChatGPT could suggest a fix.
But OpenAI doesn’t want ChatGPT to become a facial recognition machine.
For the past few months, Jonathan Mosen has been among a select group of people with access to an advanced version of the chatbot that can analyze images. On a recent trip, Mr. Mosen, the chief executive of an employment agency who is blind, used the visual analysis to determine which dispensers in a hotel room bathroom held shampoo, conditioner and shower gel. It far surpassed the image analysis software he had used in the past.
“It told me the milliliter capacity of each bottle. It told me about the tiles in the shower,” Mr. Mosen said. “It described it all in the way a blind person needs to hear it. And with one picture, I had exactly the answers I needed.”
For the first time, Mr. Mosen has been able to “interrogate images,” he said. He gave an example: Text accompanying an image he came across on social media described it as a “happy-looking blonde woman.” When he asked ChatGPT to analyze the image, the chatbot said it was a woman in a dark blue shirt taking a selfie in a full-length mirror. He could ask follow-up questions, such as what kind of shoes she was wearing and what else was visible in the mirror’s reflection.
“It’s extraordinary,” said Mr. Mosen, 54, who lives in Wellington, New Zealand, and has demonstrated the technology on “Living Blindfully,” a podcast he hosts.
In March, when OpenAI announced GPT-4, the latest software model powering its AI chatbot, the company said it was “multimodal,” meaning it could respond to text and image prompts. While most users have been able to converse with the bot only in words, Mr. Mosen was given early access to the visual analysis by Be My Eyes, a start-up that typically connects blind users with sighted volunteers and provides accessible customer service to corporate customers. Be My Eyes teamed up with OpenAI to test the chatbot’s “vision” before the feature is released to the general public this year.
Recently, the app stopped giving Mr. Mosen information about people’s faces, saying they had been obscured for privacy reasons. He was frustrated, feeling that he should have the same access to information as a sighted person.
The change reflected OpenAI’s concern that it had built something with a power it didn’t want to release.
The company’s technology can identify primarily public figures, such as people with Wikipedia pages, said Sandhini Agarwal, an OpenAI policy researcher, but it does not work as comprehensively as tools built to find faces on the internet, such as Clearview AI and PimEyes. The tool could recognize Sam Altman, the chief executive of OpenAI, in photos, Ms. Agarwal said, but not other people who work at the company.
Making such a feature publicly available would push the boundaries of what is generally considered acceptable practice by US technology companies. It could also lead to legal trouble in jurisdictions such as Illinois and Europe, where companies are required to obtain citizens’ consent to use their biometric information, including faceprints.
Additionally, OpenAI worries that the tool will say things about people’s faces that it shouldn’t, such as judging their gender or emotional state. Ms. Agarwal said OpenAI is figuring out how to address these and other safety concerns before releasing the image analysis feature more widely.
“We really want this to be a two-way conversation with the public,” she said. “If what we hear is like, ‘We really don’t want any of this,’ that’s something we very much agree with.”
Beyond the feedback from Be My Eyes users, the company’s nonprofit arm is also trying to come up with ways to get “democratic input” to help set rules for AI systems.
Ms. Agarwal said the development of visual analysis was not “unexpected,” because the model was trained by looking at images and text collected from the internet. She pointed out that celebrity facial recognition software already exists, such as a tool from Google. Google offers an opt-out for well-known people who don’t want to be recognized, and OpenAI is considering that approach.
Ms. Agarwal said OpenAI’s visual analysis could produce “hallucinations” similar to those seen with text prompts. “If you give it a picture of someone on the verge of fame, it might hallucinate a name,” she said. “Like if I give it a picture of a famous tech CEO, it could give me the name of a different tech CEO.”
The tool once inaccurately described a remote control to Mr. Mosen, she said, confidently telling him that it had buttons that weren’t there.
Microsoft, which has invested $10 billion in OpenAI, also has access to the visual analysis tool. Some users of Microsoft’s AI-powered Bing chatbot have seen the feature appear in a limited rollout; after uploading photos to it, they received a message stating that “privacy blur hides faces from Bing chat.”
Sayash Kapoor, a computer scientist and doctoral candidate at Princeton University, used the tool to decode a CAPTCHA, a visual security check meant to be intelligible only to the human eye. Even while cracking the code and recognizing the two obscured words it was given, the chatbot noted that “CAPTCHAs are designed to prevent automated bots like me from accessing certain websites or services.”
“AI is blowing up everything that separates humans from machines,” said Ethan Mollick, an associate professor who studies innovation and entrepreneurship at the Wharton School of the University of Pennsylvania.
Since the visual analysis tool suddenly appeared in Mr. Mollick’s version of the Bing chatbot last month, making him, without notice, one of the few people with early access, he hasn’t turned off his computer for fear of losing it. He gave it a photo of condiments in a refrigerator and asked Bing to suggest recipes using those ingredients. It came up with “whipped cream soda” and “creamy jalapeño sauce.”
Both OpenAI and Microsoft are aware of the power of this technology, and of its potential privacy implications. A Microsoft spokesperson said the company is “not sharing technical details” about the face blurring but is “working closely with our partners at OpenAI to uphold our shared commitment to the safe and responsible deployment of AI technologies.”