You’ll Laugh at This Simple Task AI Still Can’t Do

Artificial Intelligence Struggles with Telling Time
Most human children learn how to tell time around ages six and seven — but artificial intelligence still, apparently, can't parse a clock face. Researchers from Scotland's University of Edinburgh have found that AI models that can process text and images — otherwise known as multimodal large language models, or MLLMs — could only read analog clock faces a pitiful 25 percent of the time.
In a paper that's awaiting peer review, the AI informatics researchers explained that Google's Gemini was the "best" of the crop when they tested MLLMs from that company, OpenAI, Anthropic, and others to see how well they could read clock faces and yearly calendars. As they soon found, all of the models seemed to be challenged by the "combination of spatial awareness, context, and basic math" required to read times and dates.
Researchers tested various clock designs, including some with Roman numerals, with and without second hands, and different colored dials. Their findings show that AI systems, at best, got clock-hand positions right less than a quarter of the time. Mistakes were more common when clocks had Roman numerals or stylized clock hands.
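To give a sense of the "basic math" involved, here is a minimal sketch of the arithmetic a reader performs once the hand positions are known. This is purely illustrative and not from the study: it assumes the hand angles (measured clockwise from 12) have already been extracted from the image, which is exactly the spatial-reasoning step the models reportedly struggle with.

```python
def read_clock(hour_angle_deg: float, minute_angle_deg: float) -> str:
    """Convert clock-hand angles (degrees clockwise from 12) to a time string."""
    # Each minute mark spans 6 degrees (360 / 60).
    minute = round(minute_angle_deg / 6) % 60
    # Each hour mark spans 30 degrees (360 / 12); the hour hand drifts
    # within its 30-degree sector as minutes pass, so we take the floor.
    hour = int(hour_angle_deg // 30) % 12
    if hour == 0:
        hour = 12
    return f"{hour}:{minute:02d}"

# Hour hand at 90 degrees (pointing at the 3), minute hand at 12 -> 3:00.
print(read_clock(90, 0))
# Hour hand at 15 degrees (halfway past 12), minute hand at the 6 -> 12:30.
print(read_clock(15, 180))
```

The arithmetic itself is trivial; the study's finding is that locating the hands and mapping them to this geometry is where the models break down.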
Challenges with Reading Calendars
When testing how well the MLLMs handled calendars — specifically, ten years' worth of the large annual kind, which show all 12 months of the year on one page — the researchers found that the models were slightly better at reading dates than times, but only slightly. OpenAI's o1, the first generation of the company's reasoning models, scored the highest on the calendar challenge, getting the date questions right 80 percent of the time. Still, that means it answered one-fifth of the questions put to it incorrectly.
Rohit Saxena, the study's lead author, said in the school's press release that although "most people can tell the time and use calendars from an early age," AI seems, per the new research, to struggle to "carry out what are quite basic skills for people." These shortfalls must be addressed if AI systems are to be successfully integrated into time-sensitive, real-world applications, such as scheduling, automation, and assistive technologies.
AI's Struggles with Basic Tasks
As New Scientist reported more than three years ago, Oxford researchers found that when they trained their own AI model on analog clock faces paired with their correct readings, it was able to tell the time accurately between 74 and 84 percent of the time. That tension illustrates the current state of AI: it can often ace difficult questions in heady domains like math and the law, yet simultaneously continues to struggle with tasks as basic as telling the time.
Look no further than the tech giant Apple, which was forced this month to push back its ambitious plans to integrate AI into its voice assistant Siri. An AI that can respond to virtually any query makes a great tech demo, but if it struggles to set an alarm or schedule an appointment, you're going to have a lot of disappointed users on your hands — even if you're a well-funded company like OpenAI, Apple, or Google.
More on AI fails: Study Finds That AI Search Engines Are Wrong an Astounding Proportion of the Time.