"When we attribute human-like abilities to LLMs, we fall into an anthropomorphic bias ... But are we also showing an anthropocentric bias by failing to recognize" what they can do?
They don't have memories, although we can simulate memory through prompt engineering (summarise the conversation, include the summary in the next prompt; see the sketch below). They are goal-free and "can’t act in the physical world", although again clever prompting can make it look like they can.
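A minimal sketch of that memory trick, assuming a generic completion API (the `call_llm` helper below is a placeholder, not any specific vendor's client): the model itself is stateless, so each turn we compress the history into a summary and prepend it to the next prompt.

```python
# Simulated "memory" for a stateless LLM: summarise past turns, feed the
# summary back in. call_llm is a stand-in for whatever completion API you use.

def call_llm(prompt: str) -> str:
    """Placeholder for a real completion call (e.g. an HTTP request to an LLM API)."""
    raise NotImplementedError("plug in your own LLM client here")


def summarise(history: list[str]) -> str:
    """Ask the model to compress the conversation so far into a short summary."""
    return call_llm("Summarise this conversation in a few sentences:\n" + "\n".join(history))


def ask_with_memory(history: list[str], user_message: str) -> str:
    """Fake 'memory': summary of past turns + the new message = the next prompt."""
    summary = summarise(history) if history else "No prior conversation."
    prompt = (
        f"Conversation summary so far: {summary}\n"
        f"User: {user_message}\n"
        "Assistant:"
    )
    reply = call_llm(prompt)
    history += [f"User: {user_message}", f"Assistant: {reply}"]
    return reply
```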
Yet they do seem "to understand what we want. But to what extent do LLMs truly “understand”?"
The author uses a couple of excellent metaphors to explore what the "M" in LLM really means, starting with weather modelling:
If weather models "capture the statistics of the atmospheric conditions that generate the weather", LLMs are "a statistical model of text" which, as they predict the next word, must be capturing something about the human brain processes which generate text. Is it understanding?
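To make "a statistical model of text" concrete, here is a toy bigram model (my illustration, not the author's): it estimates the probability of the next word purely from co-occurrence counts in a tiny corpus. Real LLMs learn vastly richer statistics with neural networks, but the task they are trained on, predicting the next token from context, is the same in kind.

```python
# Toy "statistical model of text": predict the next word from bigram counts.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word.
counts: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_distribution(prev: str) -> dict[str, float]:
    """Estimate P(next word | previous word) from raw bigram counts."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}

print(next_word_distribution("the"))  # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(next_word_distribution("sat"))  # {'on': 1.0}
```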
The author then compares how LLMs are trained with how children learn understanding and language. Unlike LLMs, children are embodied: “Climbing Towards NLU” (Bender & Koller, 2020) argues that true natural language understanding (NLU) requires grounding in the real world.
What is understanding, anyway? "Ludwig Wittgenstein suggested that understanding is context-dependent and is shown through intelligent behaviour rather than mere possession of knowledge".
LLMs work like philosopher John Searle’s “Chinese Room” thought experiment, in which someone has a conversation with a room containing "detailed instructions on how to respond to someone writing in Chinese" and a person inside who uses those instructions to respond in Chinese to notes slid under the door. The room's builder may understand Chinese but isn't present, and the person inside doesn't understand Chinese at all, so who is having the conversation? The room?
For humans, "understanding comes with a subjective conscious experience of understanding. But it’s easy to see that this experience can be deceiving... Measuring ‘understanding’ is not straightforward": there are levels of understanding, as shown in problem-solving experiments measuring the understanding of animals. "Do we have that same level of understanding of LLMs to conduct similar experiments?"
How has the understanding of LLMs been measured to date?
Anthropomorphism and anthropocentrism both undermine our assessment of LLMs. According to “Anthropocentric bias and the possibility of artificial cognition” (Millière & Rathkopf), "... we use flawed tests ...":
Another problem is that LLMs tend to "repeat patterns seen in their training data", so it's quite easy to set up an adversarial test demonstrating their lack of understanding (the author uses a Monty Hall example).
But maybe "somewhere in [there] ... is something we would call understanding... buried under a tendency to repeat memorised text?" After all, "humans have a very similar bias" when we use System 1 thinking to "respond quickly when we’ve identified a heuristic we can use to avoid thinking deeply about a problem... [so] the ability to be tricked can’t be sufficient evidence of poor understanding."
After comparing an LLM to his 2-year-old daughter again, the author concludes: