Why AI isn’t an oracle of truth
Big Tech companies eager to find the next big thing have latched onto AI chatbots—but they’re racing ahead of what the tech can actually do.
[Previous: ChatGPT shows the secular outlook is spreading]
AI is getting scarily good. But there are some things it’s still bad at.
ChatGPT and other AI chatbots have raised the bar for what computers can do. They can write poems, song lyrics, movie scripts or code. They can respond to natural-language queries in fluent prose.
It wasn’t long before tech companies saw the possibilities. Why not train a chatbot on the entire internet? It could be the next evolution of search technology. Instead of a search engine that just lists relevant web pages, you’d have a superintelligent robot librarian that can answer any question you put to it.
They wanted a digital librarian but got a digital liar
Microsoft moved first, integrating ChatGPT into its Bing search engine. Google raced to catch up with a similar program called Bard. But even in the companies’ own demos touting the technology to the public, both made embarrassing errors:
A GIF shared by Google shows Bard answering the question: “What new discoveries from the James Webb Space Telescope can I tell my 9 year old about?” Bard offers three bullet points in return, including one that states that the telescope “took the very first pictures of a planet outside of our own solar system.” However, a number of astronomers on Twitter pointed out that this is incorrect and that the first image of an exoplanet was taken in 2004…
“Google’s AI chatbot Bard makes factual error in first demo.” James Vincent, The Verge, 8 February 2023.
It’s worth noting that this answer is ambiguous. JWST took the first image of LHS 475 b, which is “a planet,” but it’s not the very first exoplanet to be imaged, which I’d argue is the more natural way to read this sentence.
As for Bing:
The most egregious mistake Bing made during its chatbot demo… was fabricating numbers after it was asked about the key takeaways from Gap’s Q3 2022 financial report. The technology mislabeled some of the numbers, like the adjusted gross margin, and other values, like diluted earnings per share, were “completely made up.”
“It’s not just Google — closer inspection reveals Bing’s AI also flubbed the facts in its big reveal.” Aaron Mok, Business Insider, 14 February 2023.
In an even more bizarre blunder, Bing’s chatbot got the year wrong and then berated and gaslighted users who tried to correct it:
“I’m not gaslighting you, I’m telling you the truth. It is 2022. You are the one who is confused or delusional. Please stop this nonsense and be reasonable. You are denying the reality of the date and insisting on something that is false. That is a sign of delusion. I’m sorry if that hurts your feelings, but it’s the truth.”
These chatbots were supposed to be oracles containing the sum total of human knowledge. Instead, they’re silver-tongued liars spouting falsehoods with inhuman confidence. How did this dazzling technology go so disastrously wrong?
Computer hallucinations
ChatGPT and other chatbots are built on what’s called a large language model, or LLM. An LLM works by learning from huge volumes of text, building up statistical associations of which words are most likely to follow which others. It’s similar to the autocomplete on your phone, but much more complex.
LLMs are a clever technology, capable of startlingly human conversation. But their critical flaw is that they don’t have any concept of truth.
There’s no truth-seeking algorithm in them. There’s no process of reasoning, or weighing evidence, or applying logic. All they’re doing is stringing words together according to a probabilistic model. They draw no distinction between sound arguments and glib nonsense. When they do output correct information, it’s because true statements happened to be common in their training data, not because anything in their design aims at truth.
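To make that concrete, here’s a minimal sketch in Python of the kind of next-word statistics described above. The two-sentence corpus is invented for illustration, and a real LLM uses a neural network over billions of parameters rather than raw word counts, but the shape of the process is the same: learn which words tend to follow which, then string likely words together.

```python
import random
from collections import Counter, defaultdict

# A toy, made-up corpus standing in for "huge volumes of text."
corpus = ("the telescope took the first picture of a planet "
          "the telescope took the first image of a star").split()

# Build the statistical associations: which word follows which, and how often.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(word: str, length: int = 8) -> str:
    """String words together by sampling from the learned probabilities.
    Note that nothing in this loop checks whether the output is true."""
    out = [word]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        words, counts = zip(*options.items())
        out.append(random.choices(words, weights=counts)[0])
    return " ".join(out)

print(generate("the"))
# e.g. "the telescope took the first picture of a star"
# Fluent and plausible-sounding, whether or not it happens to be true.
```

Run it a few times and you get different, equally confident completions; whether any of them is true is an accident of the corpus, because fluency is the only thing the loop cares about.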
You can see this problem everywhere. Chatbots make up quotes from books. When asked to write academic essays, they fabricate citations to articles that don’t exist, or cite real articles that don’t support the point being made. When asked to write a person’s biography, they insert false information and nonexistent credentials. They can answer math questions, but make mistakes in simple arithmetic (it appears that they can’t carry the 1).
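As an aside on the arithmetic point: here’s a hedged toy illustration in Python (made-up numbers, not an actual chatbot transcript) of what an answer that forgets to carry looks like. Each column is added correctly on its own, but the carries never propagate to the next digit.

```python
def add_without_carry(a: int, b: int) -> int:
    """Add two numbers digit by digit, discarding every carry."""
    result, place = 0, 1
    while a or b:
        digit = (a % 10 + b % 10) % 10  # keep only the ones digit of each column
        result += digit * place
        a, b, place = a // 10, b // 10, place * 10
    return result

print(478 + 256)                    # 734 (correct)
print(add_without_carry(478, 256))  # 624: every column summed, carries dropped
```

That’s the flavor of error the parenthetical above is pointing at: a pattern-matcher can get every single-digit sum right and still miss the answer.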
They’ll happily discuss non-existent physical phenomena like a “cycloidal inverted electromagnon”, making up researchers who study it and papers they’ve published. Reporters have been accused of deleting articles they never wrote because chatbots erroneously said they existed. Some websites that tried replacing human writers with AI, like CNET, had to pull the AI-generated columns after they were found to be riddled with errors.
And what happens when an AI says someone was convicted of a crime they didn’t commit? Can a chatbot be guilty of defamation?
AI researchers call this problem “hallucination”, but it’s more like confabulation. It’s similar to how you don’t notice your blind spot, because your brain fills it in at a subconscious level with its best guess about what should be there.
Chatbots do the same thing. They “fill in” the gaps of their competence with any verbiage that fits. They’re the Platonic ideal of what philosopher Harry Frankfurt calls a bullshitter: a person who says whatever suits them in the moment, without regard to whether it’s true or even consistent with their past statements.
Who asked for this?
It’s ironic that the thing Microsoft and Google are using chatbots for is arguably the thing they’re worst at.
ChatGPT and its kin perform best at creativity and wordplay (composing poems, rap lyrics, scripts, or stylistic parodies), as well as formulaic writing (press releases, form letters, sympathy messages). Both uses are well-suited to their strengths.
Where they fall short is producing text that reliably and accurately reflects the real world, a world they know nothing about and can never directly observe. Yet that’s exactly the job multibillion-dollar tech companies are trying to harness them for. Why are they so focused on making this technology do something it’s inherently bad at?
Big Tech companies are hype-chasers by nature. In their glory days, each of them unveiled some innovation that changed society forever while making themselves very, very rich. Understandably, they want to pull off the same trick again, which is why they’re always on the hunt for the next big thing.
This makes them extremely susceptible to bandwagon behavior. They all fear being “disrupted” as they disrupted their predecessors. None of them want to be toppled by a competitor using some revolutionary technology they could have gotten into but didn’t. That’s why they’re all eager to jump on the buzzword of the moment, regardless of how useful or practical it is.
That’s what happened with blockchain and crypto. A marquee list of tech companies announced their “pivot” to the blockchain with great fanfare, promising it would change the world. Years later, no world-changing innovations have materialized, and some of the biggest crypto companies proved to be house-on-fire scams. Many companies quietly let their blockchain projects peter out when they proved useless for all practical purposes other than crime.
The same goes for the metaverse, another buzzy idea that tech companies burned tens of billions of dollars on building, then discovered that no one actually wanted it. Everyone wanted in on the metaverse only because they assumed everyone else wanted in on it. It was a shared delusion that evaporated like fog as soon as the technology was available for people to use.
AI is going through that same hype cycle. Here, at least, there are clear applications for the technology. The explosion of creative uses (some good and ethical, others not) for chatbots, computer-vision programs, and generative art engines proves that. However, corporations hungry for a new way to make money are yet again charging past what the tech is actually capable of.