Why Big Tech’s AI can’t keep its facts straight, and why ‘bolt-on’ fixes don’t work, according to Copyleaks’ CEO

By Brian Buntz | January 22, 2025


Why can’t Big Tech’s billion-dollar AI stop making stuff up? Apple recently hit the brakes on its AI-generated news alerts after a BBC complaint pointed out a false and misleading headline. And as Apple partners with OpenAI to push ChatGPT functionality into Siri, lawsuits over ‘hallucinated’ legal cases and viral confabulated ‘AI-enhanced’ search results expose a major blind spot: even the latest models can’t fact-check themselves consistently. While analysts project the AI market to balloon from hundreds of billions of dollars today to trillions by 2030, even giants like Google and OpenAI still can’t solve the basic problem of factual accuracy. We spoke to Copyleaks CEO Alon Yamin, whose company battles AI hallucinations, to understand why money and talent haven’t cured the epidemic of confabulated AI misinformation, and whether guardrails like his can survive the industry’s gold-rush scale-up.

Yamin, co-founder and CEO of the AI-based text analysis company, explains how generative AI’s reliance on pattern prediction perpetuates misinformation, making reliable, real-time fact-checking prohibitively complex. He also details how Copyleaks’ AI content detection, plagiarism detection, and verification tools can help authenticate text and flag inaccuracies before they reach users.

Big tech companies already have significant resources and command top-tier AI talent at top salaries. Why, in your view, are they still struggling to ensure factual accuracy and alignment when summarizing or generating news content?


Yamin: Despite having significant resources and talent, big tech companies face challenges when it comes to ensuring AI-generated news is factually accurate. This is largely due to the limitations of generative AI models, which are designed to predict text patterns rather than verify facts. Real-time fact-checking would require constant tweaks to the model’s knowledge base and data sources. This can be both complicated and costly, making consistency difficult. 

We have seen pundits argue that tech giants could simply “bolt on” an alignment or fact-checking module to ensure that AI-generated summaries stay faithful to the original story. What technical and practical barriers make this simpler-sounding solution so difficult in reality?

Yamin: While a fact-checking or alignment module may seem like a quick fix, it is much more complicated in practice. AI models need constant updates to their knowledge base and data, which is a substantial task in itself. Additionally, fact-checking requires understanding context, an area where current AI remains limited. While some fact-checking tools are in place, they often don’t fully prevent AI hallucinations or check for data biases.
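To make the barrier concrete, consider what the simplest possible bolt-on checker might look like. The sketch below is purely illustrative (plain Python with a made-up threshold; nothing here is any vendor’s real API): it scores a summary sentence by how many of its content words appear in the source story. It fails in both directions, flagging a faithful paraphrase while letting a single swapped fact sail through, which is exactly why real verification requires the contextual understanding Yamin describes.

```python
# A deliberately naive "bolt-on" checker: does a summary sentence reuse
# enough of the source's words? (Illustrative only; threshold is arbitrary.)
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "was", "as"}

def naive_claim_check(summary_sentence: str, source_text: str) -> bool:
    """Return True if most content words of the sentence appear in the source."""
    words = {w.lower().strip(".,") for w in summary_sentence.split()} - STOP_WORDS
    source = source_text.lower()
    hits = sum(1 for w in words if w in source)
    return hits / max(len(words), 1) >= 0.7

source = "The central bank held rates steady, citing cooling inflation."

# A faithful paraphrase is flagged as unsupported (a false alarm)...
print(naive_claim_check(
    "The Fed kept borrowing costs unchanged as price growth slowed.", source))  # False

# ...while a factually inverted sentence passes (a missed hallucination).
print(naive_claim_check(
    "The central bank raised rates, citing cooling inflation.", source))  # True
```

Distinguishing those two cases requires judging meaning, negation, and entailment against the source, not surface overlap, and doing that reliably at scale is what makes a “simple” bolt-on module costly.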

False headlines, fictional legal citations, and incorrectly labeled historical events have all surfaced in AI outputs. What are the main drivers behind LLMs’ propensity to produce “confidently wrong” information?

Yamin: The main drivers behind LLMs’ tendency to produce “confidently wrong” information are their design and limitations. Many models are built to predict the most likely sequence of words, not to verify facts. LLMs can also learn from datasets that include biases or outdated information, and as a result, models can confidently output incorrect facts without realizing they’re wrong.

Some in the field, including at OpenAI, an Apple partner, have at various points said that the hallucination issue was close to being solved, only to note later that it is more of a feature than a bug. At the same time, it seems to be one of the factors complicating AI adoption. How do you think this Apple development will shape the situation?

Yamin: Apple’s integration of AI tools like ChatGPT with Siri makes the issue of AI hallucinations even more pressing, especially since these tools are interacting with millions of users every day. While the issue has improved, it’s clear that hallucinations are not just a technical bug but a fundamental challenge of AI models. By making AI more accessible to everyday tech users, Apple and OpenAI are under public scrutiny and pressure to fix the hallucination issue. Apple will likely make efforts to improve reliability, but this highlights the need for transparency in order to build public trust in these models.

Copyleaks specializes in AI-based text analysis, including plagiarism detection and verifying content authenticity. How could your tools, or similar platforms, be integrated into these AI workflows to reduce misinformation and “hallucinations”? 

Yamin: AI detection tools like Copyleaks play a significant role in content creation by ensuring authenticity and originality. As generative AI becomes more advanced, identifying AI-written content is essential for maintaining credibility and confidence in both content and AI. By integrating our plagiarism and copyright infringement detection and content verification capabilities into AI workflows, platforms could cross-check generated text, flagging errors in real time. This could help identify false information or misrepresented facts generated by AI before they reach users. We provide AI visibility and transparency so that you are always aware of when AI is being used, while mitigating the copyright and IP risks that are common with LLM output. Our mission is to let anyone leverage AI’s benefits while mitigating its risks.
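As a rough illustration of the gating pattern Yamin describes, the sketch below routes LLM output through detection and verification steps before anything is shown to a user. Every function name here is a hypothetical placeholder standing in for a real detector or verifier, not the Copyleaks API, and the substring-matching verifier is a stand-in for genuine entailment checking.

```python
from dataclasses import dataclass

@dataclass
class GateResult:
    text: str
    ai_generated: bool            # verdict from an AI-content detector
    unverified_claims: list[str]  # sentences a verifier could not ground

def detect_ai_content(text: str) -> bool:
    """Placeholder for an AI-content detector; assume LLM output here."""
    return True

def verify_against_sources(text: str, sources: list[str]) -> list[str]:
    """Placeholder verifier: flag sentences not found in any source.
    A real verifier would do entailment checking, not substring matching."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [s for s in sentences
            if not any(s.lower() in src.lower() for src in sources)]

def gate_output(llm_output: str, sources: list[str]) -> GateResult:
    """Cross-check generated text before it reaches users; hold it for
    human review if any claim cannot be verified."""
    result = GateResult(
        text=llm_output,
        ai_generated=detect_ai_content(llm_output),
        unverified_claims=verify_against_sources(llm_output, sources),
    )
    if result.unverified_claims:
        print(f"Held for review: {len(result.unverified_claims)} unverified claim(s)")
    return result

gate_output("The merger was approved. The CEO resigned.",
            sources=["Regulators said the merger was approved on Monday."])
# -> Held for review: 1 unverified claim(s)
```

The design point is that the check sits between generation and publication: flagged output is diverted to human review rather than patched after users have already seen it.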

 
