2023-08-15

ChatGPT creator OpenAI has been using its most advanced large language model to enforce the company’s content policies, marking a major milestone in the capabilities of the technology, the firm revealed Tuesday.

Lilian Weng, OpenAI’s head of safety systems, said in an interview with Semafor that the method could also be used to moderate content on other platforms, such as social media and e-commerce sites. That job is currently done mostly by armies of workers, often located in developing countries, and the task can be grueling and traumatizing.

“I want to see more people operating their trust and safety, and moderation [in] this way,” Weng said. “This is a really good step forward in how we use AI to solve real world issues in a way that’s beneficial to society.”

The method OpenAI used to get GPT-4 to police itself is as simple as it is powerful. First, a comprehensive content policy is fed into GPT-4. Then its ability to flag problematic material is tested on a small sample of content. Humans review the results and analyze any errors. The policy team then asks the model to explain why it made the mistaken decisions. That information is then used to further refine the system.
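As a rough illustration of the loop described above, the following Python sketch runs one review round: a classifier labels a small sample under a given policy, its labels are compared with human labels, and the model is asked to explain each disagreement. Everything here is an assumption made for illustration. The names `review_round`, `toy_classify` and `toy_explain` are hypothetical, and the toy keyword matcher merely stands in for a call to a language model; none of it is OpenAI’s actual code or API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Disagreement:
    """One sample where the model and the human reviewer disagreed."""
    text: str
    model_label: str
    human_label: str
    explanation: str


def review_round(
    policy: str,
    samples: List[Tuple[str, str]],
    classify: Callable[[str, str], str],
    explain: Callable[[str, str, str], str],
) -> List[Disagreement]:
    """Label each (text, human_label) sample and collect the disagreements.

    `classify` and `explain` are hypothetical callables; in practice they
    would wrap calls to a language model.
    """
    disagreements = []
    for text, human_label in samples:
        model_label = classify(policy, text)
        if model_label != human_label:
            # Ask the model why it chose its label, so the policy team
            # can see where the policy wording was ambiguous.
            reason = explain(policy, text, model_label)
            disagreements.append(
                Disagreement(text, model_label, human_label, reason)
            )
    return disagreements


def toy_classify(policy: str, text: str) -> str:
    """Toy stand-in for a model: the policy is a comma-separated phrase list."""
    banned = [phrase.strip() for phrase in policy.lower().split(",")]
    return "flag" if any(p in text.lower() for p in banned) else "allow"


def toy_explain(policy: str, text: str, label: str) -> str:
    """Toy stand-in for the model's explanation of its own decision."""
    return f"Labelled {label!r} because of how the policy reads: {policy}"
```

In this sketch, the policy team would read the returned explanations, reword the policy, and call `review_round` again with the revised text, repeating until the disagreements are acceptably few.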

“It reduces the content policy development process from months to hours, and you don’t need to recruit a large group of human moderators for this,” Weng said.

The technique is still not as effective as experienced human moderators, OpenAI found. But it outperforms moderators who have had only light training.

If the method proves successful on other platforms, it could lead to a major shift in how companies handle problematic online content, from disinformation to child pornography. Weng said OpenAI is also researching how to expand the capabilities beyond text to images and video.

An OpenAI spokeswoman said the company knows of some customers already using GPT-4 for content moderation, but those customers did not give permission to be named.