Cade Metz’s recent New York Times article serves as an important reminder of the limitations that accompany AI chatbots.

As everyone marveled at the prowess of AI-driven platforms like OpenAI’s ChatGPT, many were slow to realize that these chatbots fabricated much of the information they provided.

They still do, in what is widely known as hallucination.

Per Metz, we’re now beginning to understand how often it happens and why.

A new start-up called Vectara, founded by former Google employees, is trying to figure out how often chatbots veer from the truth. The company’s research estimates that even in situations designed to prevent it from happening, chatbots invent information at least 3 percent of the time, and as much as 27 percent.

OpenAI’s ChatGPT, by far the most widely used chatbot, sits at the low end of that range, around 3 percent.

One of the methodologies Vectara employed to gauge the reliability of these chatbots was straightforward yet revealing: asking them to summarize news articles, then checking the summaries for factual integrity.

The results confirm their apprehensions: these AI systems, even in simple tasks like summarization, tend to introduce falsehoods, raising concerns about their performance in more complex applications. 
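To make the idea concrete, here is a minimal sketch of a summary-consistency check. This is not Vectara’s actual method (which relies on a trained evaluation model); it is a toy illustration that simply flags summary sentences whose content words never appear in the source article, which is one crude way a fabricated claim can surface.

```python
# Toy illustration only: flag summary sentences with no lexical support
# in the source article. Real hallucination detection (e.g., Vectara's)
# uses trained models, not word overlap.

def unsupported_sentences(article: str, summary: str) -> list[str]:
    source_words = set(article.lower().split())
    flagged = []
    for sentence in summary.split("."):
        # Only consider content-bearing words (longer than 3 characters).
        words = [w for w in sentence.lower().split() if len(w) > 3]
        # If none of the sentence's content words appear in the source,
        # treat the sentence as potentially invented.
        if words and not any(w in source_words for w in words):
            flagged.append(sentence.strip())
    return flagged

article = "The city council approved the new park budget on Tuesday."
summary = "The council approved the park budget. The mayor resigned immediately."
print(unsupported_sentences(article, summary))
# → ['The mayor resigned immediately']
```

Even a crude filter like this shows why summarization is a useful test bed: the source text gives you ground truth to check against, something open-ended chat does not.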

Efforts are underway by AI developers to curtail these hallucinations, but the challenge remains substantial. The crux of the problem lies in the very architecture of these AIs: language models trained on the internet’s vast and varied content, which is riddled with inaccuracies. 

I wouldn’t stop using AI chatbots in your publishing and writing because of hallucinations. There is too much to gain from an AI-powered writing assistant, on multiple fronts. The key is to be vigilant.

While I use ChatGPT and Lou for their unmatched efficiency, I remain alert and verify the information they provide, particularly when summarizing news stories and blog posts.