
AI Hallucinations Can Prove Costly

By Unknown Author | Source: Informationweek | Read time: 4 minutes

Artificial intelligence hallucinations can lead to erroneous decisions, causing significant business setbacks. It's crucial for CIOs and IT leaders to understand the implications of AI glitches on their operations. They should take proactive measures to mitigate potential risks associated with AI misinterpretations. It's also important to continuously monitor AI systems to promptly detect and rectify any anomalies. Lastly, organizations should invest in AI ethics and governance to ensure responsible use of this technology.


Introduction to Large Language Models and Generative AI

Large language models (LLMs) and generative AI are fundamentally changing the way businesses operate -- and how they manage and use information. They’re ushering in efficiency gains and qualitative improvements that would have been unimaginable only a few years ago. But all this progress comes with a caveat. Generative AI models sometimes hallucinate. They fabricate facts, deliver inaccurate assertions and misrepresent reality.

The resulting errors can lead to flawed assessments, poor decision-making, automation errors and ill will among partners, customers and employees. “Large language models are fundamentally pattern recognition and pattern generation engines,” points out Van L. Baker, research vice president at Gartner. “They have zero understanding of the content they produce.” Adds Mark Blankenship, director of risk at Willis A&E: “Nobody is going to establish guardrails for you. It’s critical that humans verify content from an AI system. A lack of oversight can lead to breakdowns with real-world repercussions.”

False Promises and Real-world Repercussions

Already, 92% of Fortune 500 companies use ChatGPT. As GenAI tools become embedded across business operations -- from chatbots and research tools to content generation engines -- the risks associated with the technology multiply.

“There are several reasons why hallucinations occur, including mathematical errors, outdated knowledge or training data and an inability for models to reason symbolically,” explains Chris Callison-Burch, a professor of computer and information science at the University of Pennsylvania. For instance, a model might treat satirical content as factual or misinterpret a word that has different meanings in different contexts. Regardless of the root cause, AI hallucinations can lead to financial harm, legal problems, regulatory sanctions, and damage to trust and reputation that ripples out to partners and customers.

Examples of AI Hallucinations

In 2023, a New York City lawyer using ChatGPT filed a lawsuit that contained egregious errors, including fabricated legal citations and cases. The judge later sanctioned the attorney and imposed a $5,000 fine. In 2024, Air Canada lost a lawsuit when it failed to honor the price its chatbot quoted to a customer. The case resulted in minor damages and bad publicity.

At the center of the problem is the fact that LLMs and GenAI models are autoregressive: they generate words and pixels one step at a time by predicting what plausibly comes next, with no inherent understanding of what they are creating. “AI hallucinations, most associated with GenAI, differ from traditional software bugs and human errors because they generate false yet plausible information rather than failing in predictable ways,” says Jenn Kosar, US AI assurance leader at PwC.
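To make that concrete, the following Python sketch is purely illustrative: it uses an invented probability table rather than any real model, but it shows what autoregressive generation amounts to. Each next word is chosen because it is statistically plausible given the words so far, with no check that the result is true.

```python
import random

# Hypothetical next-token probabilities keyed by the most recent token.
# Real LLMs learn billions of such associations, but the mechanism is the
# same: pick what plausibly comes next, not what is verifiably correct.
NEXT_TOKEN_PROBS = {
    "the":   {"court": 0.4, "ruling": 0.3, "plaintiff": 0.3},
    "court": {"ruled": 0.6, "cited": 0.4},
    "cited": {"Smith": 0.5, "Jones": 0.5},  # plausible-sounding, possibly fabricated
    "ruled": {"against": 0.5, "for": 0.5},
}

def generate(prompt: list[str], max_tokens: int = 5) -> list[str]:
    tokens = list(prompt)
    for _ in range(max_tokens):
        choices = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not choices:
            break
        words, weights = zip(*choices.items())
        # Sampling by plausibility, not truth -- the root of hallucination.
        tokens.append(random.choices(words, weights=weights)[0])
    return tokens

print(" ".join(generate(["the"])))
```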

Managing the Risks of AI Hallucinations

Although there’s no simple fix for AI hallucinations, experts say that business and IT leaders can take steps to keep the risks in check. “The way to avoid problems is to implement safeguards surrounding things like model validation, real-time monitoring, human oversight and stress testing for anomalies,” Kosar says.
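What those safeguards look like in practice is organization-specific. As one hedged illustration, the Python sketch below routes model output through a basic validation step and flags anything that cites an unrecognized source for human review; `call_model` and `citations_check` are hypothetical placeholders, not a real vendor API.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    text: str
    needs_review: bool
    reasons: list

def call_model(prompt: str) -> str:
    # Placeholder for whatever LLM API the organization actually uses.
    return "Refund claims are allowed within 90 days, per case Smith v. Jones."

def citations_check(text: str, known_sources: set) -> list:
    # Hypothetical validation: flag any cited source not found in a trusted index.
    candidates = ("Smith v. Jones", "Policy v3.2")
    return [s for s in candidates if s in text and s not in known_sources]

def answer_with_oversight(prompt: str, known_sources: set) -> Draft:
    text = call_model(prompt)
    unverified = citations_check(text, known_sources)
    return Draft(text=text, needs_review=bool(unverified), reasons=unverified)

draft = answer_with_oversight("What is our refund policy?", known_sources={"Policy v3.2"})
if draft.needs_review:
    print("Route to human reviewer; unverified references:", draft.reasons)
```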

Training models with only relevant and accurate data is crucial. In some cases, it’s wise to plug in only domain-specific data and construct a more specialized GenAI system, Kosar says, and a small language model (SLM) can pay dividends. For example, “AI that’s fine-tuned with tax policies and company data will handle a wide range of tax-related questions about your organization more accurately,” she explains.
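One common way to act on the “only relevant data” advice is to retrieve answers from a small, vetted set of domain documents and instruct the model to answer only from that material. The Python sketch below is a rough illustration under that assumption; the document excerpts and the `call_model` helper are hypothetical.

```python
# Tiny, curated knowledge base standing in for vetted tax documents.
TAX_DOCS = {
    "withholding": "Hypothetical excerpt: employee withholding is remitted monthly...",
    "vat":         "Hypothetical excerpt: VAT registration is required above threshold X...",
}

def retrieve(question: str) -> str:
    # Naive keyword retrieval; a production system would use embeddings and ranking.
    hits = [text for key, text in TAX_DOCS.items() if key in question.lower()]
    return "\n".join(hits)

def call_model(prompt: str) -> str:
    # Placeholder for the organization's LLM or SLM endpoint.
    return "(model response constrained to the supplied excerpts)"

def grounded_answer(question: str) -> str:
    context = retrieve(question)
    if not context:
        # Refusing is safer than letting the model improvise an answer.
        return "No vetted source covers this; escalate to a tax specialist."
    prompt = f"Answer using ONLY the excerpts below.\n\n{context}\n\nQuestion: {question}"
    return call_model(prompt)

print(grounded_answer("How do we handle VAT registration?"))
```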

Future of Large Language Models and Generative AI

Equally important is tracking the rapidly evolving LLM and GenAI spaces and understanding performance results across different models. At present, nearly two dozen major LLMs exist, including ChatGPT, Gemini, Copilot, LLaMA, Claude, Mistral, Grok, and DeepSeek. Hundreds of smaller niche programs have also flooded the app marketplace. Regardless of the approach an organization takes, “In early stages of adoption, greater human oversight may make sense while teams are upskilling and understanding risks,” Kosar says.

Fortunately, organizations are becoming savvier about how and where they use AI, and many are constructing more robust frameworks that reduce the frequency and severity of hallucinations. At the same time, vendor software and open-source projects are maturing. Concludes Blankenship: “AI can create risks and mitigate risks. It’s up to organizations to design frameworks that use it safely and effectively.”

