Wenbo’s job is to listen to what people in China say to their voice-activated smart assistants as part of his work for a leading Chinese artificial intelligence company.
Every shift, Wenbo, who uses a wheelchair, sorts through audio clips, categorizing them based on things like the presence of speech or other sounds, the presumed age or gender of the speaker, and whether or not the person used the device’s “wake word” (like “Hey Google” or “Alexa” for U.S. smart assistants).
Wenbo, in other words, is playing a crucial role in the development of one of the world’s most promising technologies. Around the globe, hundreds of thousands of people like him — many of them independent contractors with few protections — create and organize the datasets that allow self-driving cars to recognize raindrops, help ChatGPT avoid spewing racial slurs, and ensure Facebook’s algorithms don’t mistake a pepperoni slice for a female nipple.
Despite the importance of this work, tech companies rarely acknowledge how much they rely on data annotators (sometimes called labelers or content moderators). The audio clips are sent to Wenbo because artificial intelligence technology isn’t sophisticated enough to recognize what’s happening in them, perhaps because the speaker has a thick accent, or what they said was confusing or unclear. The goal is to teach the AI model how best to make sense of these edge cases, so that it becomes more accurate in the future.
A new study in the peer-reviewed journal First Monday shows why the data annotation industry shouldn’t be ignored or discussed only in the context of unfair labor practices. If the world keeps overlooking the role it plays in manufacturing AI, it will be unable to understand how complex models really work. People will likely believe instead in a popular illusion about AI: that it is solely the creation of elite engineers in places like Silicon Valley or Beijing.
The reality is that it is also built using the specialized labor of Wenbo and hundreds of thousands of his colleagues. The tiny decisions and judgment calls they make shape details like the tone or rhythm of a chatbot’s replies.
As part of her research, Di Wu, a PhD candidate at the Massachusetts Institute of Technology, interviewed Wenbo and over a dozen other annotators working for a unique program run by a disabled persons’ organization in China she calls ENABLE. (That name, like the names given to all of her interviewees, is a pseudonym used to protect them from political or economic blowback for speaking to a researcher.)
ENABLE helped Wenbo and his colleagues negotiate with the AI company for accommodations to make their jobs easier. For example, they asked for text-based datasets to be compatible with screen readers — software that converts written words into audio files — so that they could be interpreted by annotators who are blind or have limited vision.
The AI company was happy to comply, because the work ENABLE produces “recently outperformed many non-disabled competitors” and became one of the firm’s major data annotation service providers, Wu writes. One of the reasons ENABLE is so successful is that its workers have honed their skills over relatively long periods of time.
That might sound like an obvious advantage in any job, but data annotation is often misunderstood as mindless “click work” or as simply teaching an AI about objective “human preferences.” As previous research has shown, however, annotators bring their own individual experiences to the task, and must also work within the structures tech companies have designed.
Multiple ENABLE annotators told Wu that they frequently heard different things in the audio clips than the quality assurance (QA) reviewers who checked their work. To get it approved, they trained themselves to be attuned to the ears of the QAs.
“There is no standard,” Wenbo told Wu. “For things like sound, everyone’s ears are different, and everyone’s accents are different.” For example, if someone spoke the smart assistant’s wake word quickly, Wenbo might find it clear enough to pass, but the QA could easily disagree.
If companies developing AI want to uncover the biases in their models, they will need to take the work done by data annotators seriously, and give them opportunities to provide feedback, which is what happens at ENABLE. The annotators meet regularly with developers at the AI firm, Wu writes, where they discuss trends, emerging problems, and recommendations to improve the system.
Julian Posada, an incoming professor at Yale University who has studied the data annotation industry, said this is not the norm. In many cases, data annotators work as contractors on third-party platforms, where there is no way to communicate directly with their clients.
“What if the majority of workers don’t understand the guidelines properly? Then you will be giving the greenlight to things that are not properly data-annotated,” said Posada. “One of the problems that the platform industry faces is bias from ground truth data.” (Ground truth is the term AI engineers use for the verified, correct labels against which a model is trained and evaluated.)
Room for Disagreement
Data annotators will eventually eliminate their own jobs: each time AI becomes more advanced, it needs less human input, or different kinds of it, to function. This built-in impermanence is a major reason why the data annotation industry may continue to struggle for recognition. “All these jobs, all these things, are produced in a way that’s not for the long term,” said Posada.
In recent years, government agencies and technology companies in China have set up a number of programs for disabled people to conduct data annotation work, Wu writes. The initiatives allow firms to fulfill a disability employment quota required by the government, and are often branded as “tech for good.”
But in ENABLE’s case, the AI company has not claimed a quota by hiring its workers, and they are not paid less than their peers without disabilities in similar roles.
- In 2021, China’s Ministry of Human Resources and Social Security released occupational standards for the data annotation sector, including skill requirements, reported the Chinese online magazine Sixth Tone.
- Rest of World traced how self-driving car companies helped spur the professionalization of the data annotation industry. More platforms now have “quality control measures to ensure jobs for autonomous vehicle clients come back with very few mistakes,” wrote reporter Vittoria Elliott.