Jon Randy 🎖️

LLMs are Bullshitters 🐮💩

Just read a great paper on LLMs. I strongly suggest reading it, but here's the conclusion:

Investors, policymakers, and members of the general public make decisions on how to treat these machines and how to react to them based not on a deep technical understanding of how they work, but on the often metaphorical way in which their abilities and function are communicated. Calling their mistakes ‘hallucinations’ isn’t harmless: it lends itself to the confusion that the machines are in some way misperceiving but are nonetheless trying to convey something that they believe or have perceived. This, as we’ve argued, is the wrong metaphor. The machines are not trying to communicate something they believe or perceive. Their inaccuracy is not due to misperception or hallucination. As we have pointed out, they are not trying to convey information at all. They are bullshitting.

Calling chatbot inaccuracies ‘hallucinations’ feeds in to overblown hype about their abilities among technology cheerleaders, and could lead to unnecessary consternation among the general public. It also suggests solutions to the inaccuracy problem which might not work, and could lead to misguided efforts at AI alignment amongst specialists. It can also lead to the wrong attitude towards the machine when it gets things right: the inaccuracies show that it is bullshitting, even when it’s right. Calling these inaccuracies ‘bullshit’ rather than ‘hallucinations’ isn’t just more accurate (as we’ve argued); it’s good science and technology communication in an area that sorely needs it.

ChatGPT is bullshit | Ethics and Information Technology

Recently, there has been considerable interest in large language models: machine learning systems which produce human-like text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters, and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.

link.springer.com

Top comments (6)

Fyodor

These camps of fanatics from both sides are pathetic, to be honest 🤦‍♂️ LLM is just a tool — like a DB, or a UI library, or a toaster… A good tool BTW, with very interesting applications. And folks build other fun tools with and/or around them, and work on making the output better, and such stuff. I’m not talking about ChatGPT wrappers obviously, but there are so many cool open-source models, and agents, and tools, and other projects. Talking about bullshit in the output is like judging the Google search results — they’re just the results of the data and the algorithms put into them. Garbage in, garbage out. We can make them better. People already do. Without hype and exaltation, just for the sake of accepting the new software development challenges. Everything out there is full of lame marketing, but as usual, it’s possible to filter that out.

Jon Randy 🎖️ • Edited

Did you read the paper? They're using the term 'bullshit' for its actual meaning rather than to be derogatory.

There are potential dangers in mischaracterising what LLMs actually are, and what they do. Greater understanding is paramount to avoiding these dangers.

Red Ochsenbein (he/him)

"We can make them better." I am afraid we can not. Not the way those models work at the moment. It already takes incredible amounts of exploitative work to enable what those models do today. To actually categorize data according to its factuality and accuracy is simply not possible because of the amount of data fed into those models AND the expertise required across ALL fields. And no, more technology can not solve this either. And there is another factor making it harder and harder for LLMs to improve in the future: the amount of data generated and published by LLMs. Studies have already shown models are deteriorating because of AI generated inputs. To weed out those will be pretty hard, not only because the LLM generated content will soon surpass all the texts humanity has ever written (if it hasn't already).

And even when all of this can be solved: LLMs are still just 'next word predicting' machines and 'social acceptance automatons' at that. They'll not be able to reason and will probably always generate bullshit.

Mike Talbot ⭐

It's a fascinating paper, and I agree with 'bullshit' as the term for when it predicts answers based on their "looking like" the real-world data it has been trained on. That's definitely bullshit, and it's definitely a fault of ChatGPT 3.x that is hard to avoid. I do think, though, that the paper is very focused on the production of text as the end goal of the systems being built - I think this is an intermediate step towards generally processing and refining information, which is how we as humans use language to both hold and process concepts.

All the work I do with ChatGPT 4 is processing the information I give it, using its knowledge of language and the world to improve on finding links and concepts. When doing this it is reasonably easy to ensure that "bullshit" or unreal ideas and concepts are not found in the output. We've had a lot of success in reviewing, standardising and simplifying complex documents and initially, it looks like language translation of these concepts is near human-accurate, something not possible with simple translators like Google Translate.

Jan Küster

+1 for more science-based discussions on dev.to 🙌

Oscar

Going to read this very soon! Just based on the conclusion, it sounds like a very interesting paper and I'm curious to see what else the authors have to say about LLMs.