Artificial intelligence (AI) has been touted as a revolutionary technology with the potential to automate many jobs and transform the way we work. However, a recent benchmark suggests that even the most advanced AI agents fall far short when it comes to performing freelance work.
Researchers at Scale AI and the Center for AI Safety (CAIS) recently developed a new benchmark that measures an AI agent's ability to automate economically valuable work. The experiment involved giving several leading AI agents a range of simulated freelance tasks, including graphic design, video editing, game development, and administrative chores.
The results were strikingly underwhelming. Even the best AI agents completed less than 3% of the work, earning a paltry $1,810 out of a possible $143,991. The most capable AI agent in the experiment was Manus, from a Chinese startup, followed closely by Grok from xAI, Claude from Anthropic, ChatGPT from OpenAI, and Gemini from Google.
"It's hard to see how this is going to change much anytime soon," says Dan Hendrycks, director of CAIS. "We've been talking about AI replacing humans for jobs for years, but most of that has been theoretical or hypothetical."
The researchers acknowledge that their benchmark is not a perfect measure of an AI agent's economic impact, as many professions include tasks not covered by the measure. Nevertheless, the findings offer a sobering reminder that AI is unlikely to be stepping into vacated roles anytime soon.
Meanwhile, speculation about AI surpassing human intelligence and replacing vast numbers of workers continues to gain momentum. In March, Dario Amodei, CEO of Anthropic, suggested that 90% of coding work would be automated within months. The latest benchmark results cast doubt on that kind of timeline.
As one researcher notes, "They don't have long-term memory storage and can't do continual learning from experiences. They can't pick up skills on the job like humans." The idea that AI is already taking jobs is nonetheless gaining traction: Amazon recently announced plans to cut 14,000 jobs, in part due to the rapid rise of generative artificial intelligence.
It's clear that while AI has the potential to transform many aspects of our work lives, it's unlikely to be a silver bullet for job replacement anytime soon.